NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other
types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358 or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide 
from the culture or from the host cells. Preferred embodiments include those in which the 
protein produced by such process is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety of 
techniques known to those skilled in the art of molecular biology. These techniques include use 
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is 
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used 
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample 
using, e.g., in situ hybridization. 

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical 
mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional 
procedures and methods that are currently applied to other proteins. For example, a polypeptide 
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such 
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the 
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight 
markers, and as a food supplement. 

Methods are also provided for preventing, treating, or ameliorating a medical condition,
comprising the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a 
pharmaceutically acceptable carrier. 

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex, and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample, comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex,
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family (as 
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and 
polynucleotides of the present invention are useful for a variety of applications, as described 
herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
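
The base-pairing rule in the definition above can be illustrated with a short sketch. The function names (complement, reverse_complement, fraction_paired) are hypothetical and chosen only for illustration; the sketch assumes plain A/C/G/T DNA strands of equal length and does not model hybridization strength.

```python
# Illustrative sketch only: Watson-Crick pairing for plain A/C/G/T strings.
_PAIR = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the complementary strand, read in the opposite (3'->5') direction."""
    return seq.upper().translate(_PAIR)


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5'->3'."""
    return complement(seq)[::-1]


def fraction_paired(strand_5to3: str, partner_3to5: str) -> float:
    """Fraction of positions that base-pair: 1.0 is "complete", less is "partial"."""
    if not strand_5to3 or len(strand_5to3) != len(partner_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    expected = complement(strand_5to3)
    matches = sum(a == b for a, b in zip(expected, partner_3to5.upper()))
    return matches / len(strand_5to3)


if __name__ == "__main__":
    # 5'-AGT-3' pairs with 3'-TCA-5', as in the example in the text.
    print(complement("AGT"))
    print(fraction_paired("AGT", "TCA"))
```

A value below 1.0 from fraction_paired corresponds to the "partial" complementarity described above.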

The term "embryonic stem cells (ES)" refers to cells that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages, particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis, that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
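
As a minimal illustration of the notation just described, the following sketch shows the T-to-U substitution for RNA and the expansion of N into the four bases. The helper names to_rna and expand_n are hypothetical and are not part of the Sequence Listing or of any particular software package.

```python
# Illustrative sketch only: sequence notation helpers for A/C/G/T/N strings.
from itertools import product

BASES = ("A", "C", "G", "T")


def to_rna(dna: str) -> str:
    """Write a DNA sequence as RNA by substituting U (uracil) for T (thymine)."""
    return dna.upper().replace("T", "U")


def expand_n(seq: str) -> list[str]:
    """List every concrete sequence represented when N stands for A, C, G or T."""
    choices = [BASES if base == "N" else (base,) for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(to_rna("ATGN"))
    print(expand_n("AN"))
```

Because each N multiplies the number of represented sequences by four, expand_n is practical only for sequences with a few ambiguous positions.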

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO:
1-1786 and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is approximately 1 in 300. In
the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20
(about 1.1 x 10^12) possible twenty-mers exist, there are roughly 300 times more twenty-mers than
there are base pairs in a set of human chromosomes. Using the same analysis, the probability for a
seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these
segments are used in arrays for expression studies, fifteen-mer segments can be used. The
probability that the fifteen-mer is fully matched in the expressed sequences is also approximately
one in five because expressed sequences comprise less than approximately 5% of the entire genome
sequence.
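
The arithmetic in the preceding paragraph can be checked with a short sketch. It assumes, as the text does, a genome of three billion base pairs and an expressed fraction of 5%, and it treats every k-mer as equally likely; the function name odds_against_chance_match is hypothetical.

```python
# Illustrative sketch only: ratio of possible k-mers to available positions,
# assuming uniformly random sequence as in the estimate in the text.
GENOME_BP = 3_000_000_000          # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05          # upper bound on expressed sequence (from the text)
EXPRESSED_BP = GENOME_BP * EXPRESSED_FRACTION


def odds_against_chance_match(k: int, target_bp: float) -> float:
    """Return X such that a given k-mer is fully matched by chance about 1 time in X."""
    return 4 ** k / target_bp


if __name__ == "__main__":
    print(f"20-mer vs genome:    1 in {odds_against_chance_match(20, GENOME_BP):.0f}")
    print(f"17-mer vs genome:    1 in {odds_against_chance_match(17, GENOME_BP):.1f}")
    print(f"15-mer vs expressed: 1 in {odds_against_chance_match(15, EXPRESSED_BP):.1f}")
```

Under these assumptions the computed ratios are of the same order as the rounded figures quoted above (a few hundred for a twenty-mer, single digits for the seventeen-mer and fifteen-mer cases).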

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five-mer. The probability that the twenty-five-mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen-mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
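
The single-mismatch calculation described above, (1/4^k) x (3 x k), can be sketched as follows. The sketch assumes uniformly random sequence, a genome of three billion base pairs and an expressed fraction of 5%, all taken from the surrounding text; the function names are hypothetical.

```python
# Illustrative sketch only: chance that a k-mer occurs with exactly one mismatch,
# following the (1/4^k) x (3 x k) formula given in the text.
GENOME_BP = 3_000_000_000                 # from the text
EXPRESSED_BP = GENOME_BP * 0.05           # 5% expressed fraction, from the text


def single_mismatch_probability(k: int) -> float:
    """Probability that a random k-mer differs from a given k-mer at exactly one position."""
    return (3 * k) / 4 ** k


def expected_single_mismatch_hits(k: int, target_bp: float) -> float:
    """Expected number of single-mismatch occurrences across target_bp positions."""
    return single_mismatch_probability(k) * target_bp


if __name__ == "__main__":
    print(expected_single_mismatch_hits(20, GENOME_BP))     # twenty-mer, whole genome
    print(expected_single_mismatch_hits(18, EXPRESSED_BP))  # eighteen-mer, expressed sequence
```

Under these assumptions both values fall between roughly 0.1 and 0.2, which is the same order as the "approximately one in five" figures stated above.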

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
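
As a sketch of this definition, the following scans one reading frame of a DNA string and returns the longest run of triplets that contains no termination codon. It assumes the standard stop codons TAA, TAG and TGA and does not require a start codon, since the definition above does not; the function name longest_orf is hypothetical.

```python
# Illustrative sketch only: longest stop-free run of codons in one reading frame.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def longest_orf(seq: str, frame: int = 0) -> str:
    """Return the longest series of triplets in the given frame with no stop codon."""
    seq = seq.upper()
    best = ""
    current = []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            # A stop codon closes the current run; keep it if it is the longest so far.
            candidate = "".join(current)
            if len(candidate) > len(best):
                best = candidate
            current = []
        else:
            current.append(codon)
    candidate = "".join(current)
    if len(candidate) > len(best):
        best = candidate
    return best


if __name__ == "__main__":
    print(longest_orf("ATGAAATAGCCC"))  # ATG AAA | TAG | CCC
```

Calling the function with frame set to 0, 1 and 2 (and again on the reverse complement) would cover all six reading frames.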

The terms "operably linked" or "operably associated" refer to functionally related nucleic 
acid sequences. For example, a promoter is operably associated or operably linked with a coding 
sequence if the promoter controls the transcription of the coding sequence. While operably 
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its 
differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes for the full 
length protein which may include any leader sequence or any processing sequence. 

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature 
protein portion may or may not include the initial methionine residue. The methionine residue 
may be removed from the protein during processing in the cell. The peptide may be produced 
synthetically, or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be 
introduced to optimize cloning into a plasmid or viral vector or expression in a particular 
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in 
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of 
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain 
affinities, or degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative amino 
acid replacements. "Conservative" amino acid substitutions may be made on the basis of 
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic 
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include 
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar 
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and 
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or 
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 
amino acids. The variation allowed may be experimentally determined by systematically making 
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using 
recombinant DNA techniques and assaying the resulting recombinant variants for activity. 
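
The grouping of amino acids given above can be expressed as a simple lookup. This is a sketch only: it treats a substitution as "conservative" when both residues fall in the same one of the four classes listed in the text, which is just one of the possible criteria mentioned there; the names RESIDUE_CLASSES, residue_class and is_conservative are hypothetical.

```python
# Illustrative sketch only: the four residue classes listed in the text,
# written with standard one-letter amino acid codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}


def residue_class(aa: str) -> str:
    """Return the class name for a one-letter amino acid code."""
    aa = aa.upper()
    for name, members in RESIDUE_CLASSES.items():
        if aa in members:
            return name
    raise ValueError(f"unknown amino acid code: {aa!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same class."""
    return residue_class(original) == residue_class(replacement)


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # leucine -> isoleucine, both nonpolar
    print(is_conservative("K", "E"))  # lysine (basic) -> glutamic acid (acidic)
```

Whether a given substitution actually preserves activity still has to be determined experimentally, as the paragraph above notes.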

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which 
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the 
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include 
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the 
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization 
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are 
described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 
14-base oligonucleotides), 48°C (for 17-base oligos), 55°C (for 20-base oligonucleotides), and 
60°C (for 23-base oligonucleotides). 
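
The length-dependent wash temperatures listed above can be captured as a simple lookup. The following is a minimal sketch only; the names OLIGO_WASH_TEMP_C and wash_temperature_c are hypothetical and chosen for illustration, and no interpolation between the listed lengths is assumed because the text gives none.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate, keyed by
# oligonucleotide length in bases, exactly as listed in the text above.
# The dictionary and function names are hypothetical, for illustration only.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo: str) -> int:
    """Return the listed wash temperature for an oligonucleotide.

    Raises KeyError for lengths the text does not list; nothing is
    interpolated, since the text states only these four values.
    """
    return OLIGO_WASH_TEMP_C[len(oligo)]


if __name__ == "__main__":
    probe = "ACGT" * 5  # a made-up 20-base oligonucleotide
    print(len(probe), "bases: wash at", wash_temperature_c(probe), "degrees C")
```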

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent 
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression 
characteristics are considered substantially equivalent. For the purposes of determining 
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious 
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can
also be determined by other methods known in the art, e.g. by varying hybridization conditions. 
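
The numeric criterion in the preceding definition (changes divided by the number of residues in the substantially equivalent sequence, about 0.35 or less) can be illustrated as follows. This is a sketch under stated assumptions: the two sequences are supplied already aligned, with "-" marking a gap; the alignment step itself (e.g., the Jotun Hein method) is not implemented; the function names and the toy sequences are hypothetical; and the functional requirements of the definition (no adverse functional dissimilarity) are not modeled.

```python
def fraction_varied(aligned_reference: str, aligned_subject: str) -> float:
    """Changes in the subject relative to the reference, divided by the
    number of residues in the subject (gap characters are not residues).

    Both inputs are assumed to be pre-aligned and of equal length, with
    '-' marking a gap. A column counts as one change when the characters
    differ: a substitution, or a gap on one side (addition or deletion).
    """
    if len(aligned_reference) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    changes = sum(1 for r, s in zip(aligned_reference, aligned_subject) if r != s)
    subject_residues = sum(1 for s in aligned_subject if s != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return changes / subject_residues


def meets_numeric_criterion(aligned_reference: str, aligned_subject: str,
                            max_fraction: float = 0.35) -> bool:
    """True when the varied fraction is at or below the threshold.

    The text says "about 0.35", so the exact cutoff is an assumption.
    """
    return fraction_varied(aligned_reference, aligned_subject) <= max_fraction


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # made-up sequences, not from the Sequence Listing
    subject = "MKTAYLAK-R"    # one substitution (I to L) and one deletion (Q)
    print(round(fraction_varied(reference, subject), 3))
    print(meets_numeric_criterion(reference, subject))
```

Tightening max_fraction to 0.30, 0.25, 0.20, 0.10, or 0.05 corresponds to the 70%, 75%, 80%, 90%, and 95% identity embodiments recited above.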

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO: 1787-3572 and 5359-7144; and a polynucleotide 
comprising the nucleotide sequence encoding the mature protein coding sequence of the 
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the 
present invention also include, but are not limited to, a polynucleotide that hybridizes under 
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO:1-1786
and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any 
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of 
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a 
specific domain or truncation of the polypeptides of SEQ ID NO: 1787-3572 and 5359-7144. 
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in 
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic 
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the 
sequence information disclosed herein. Such methods include the preparation of probes or primers 
from the disclosed sequence information for identification and/or amplification of genes in 
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene. 

The polynucleotides of the invention also provide polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid 
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences 
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in

WO 01/53312 PCT/US00/34263 

the same family of genes or can differentiate human genes from genes of other species, and are 
preferably based on unique nucleotide sequences. 

The sequences falling within the scope of the present invention are not limited to these 
specific sequences, but also include allelic and species variations thereof. Allelic and species 
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes 
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated. 
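
The codon substitution contemplated above can be checked computationally: two ORFs are equivalent in this sense when they translate to the same amino acid sequence. The sketch below assumes the standard nuclear genetic code; the function names are hypothetical and the short ORFs are made up for illustration, not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
# '*' marks a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate a DNA open reading frame, codon by codon, in frame 1."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a: str, orf_b: str) -> bool:
    """True when both ORFs translate to the identical amino acid sequence."""
    return translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    original = "ATGCTGAAATAA"    # Met-Leu-Lys-stop (illustrative only)
    synonymous = "ATGTTAAAGTGA"  # different codons, same amino acids
    print(translate(original), translate(synonymous))
    print(encode_same_protein(original, synonymous))
```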

The nearest neighbor or homology result for the nucleic acids of the present invention, 
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be performed.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are 
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a 
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent 
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as, Edelman et al., 
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a 
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well 
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current 
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression 
of these novel nucleic acids. Such DNA sequences include those which are capable of 
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant 
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into 
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358
or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory 
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of 
suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following 
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many 
suitable expression control sequences are known in the art. General methods of expressing 
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein, "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector 
or cell in such a way that the protein is expressed by a host cell which has been transformed 
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art. 

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and 
preferably, a leader sequence capable of directing secretion of translated protein into the 
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a 
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination 
signals in operable reading phase with a functional promoter. The vector will comprise one or 
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be 
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by 
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an 
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA. 



4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that 
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule 
can be complementary to the entire coding region of a mRNA, but more preferably is an 
oligonucleotide that is antisense to only a portion of the coding or noncoding region of a mRNA. 
For example, the antisense oligonucleotide can be complementary to the region surrounding the 
translation start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can 
be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., 
phosphorothioate derivatives and acridine substituted nucleotides can be used. 
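
Designing an antisense oligonucleotide by Watson-Crick pairing, as described above, amounts to taking the reverse complement of the chosen window of the coding strand. The following is a minimal sketch under stated assumptions: only Watson-Crick pairing is modeled (not Hoogsteen), the oligonucleotide is unmodified DNA, the function name antisense_oligo is hypothetical, and the example sequence is made up rather than taken from the Sequence Listing.

```python
# Watson-Crick complement for unmodified DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(coding_strand: str, start: int, length: int) -> str:
    """Return a DNA antisense oligonucleotide, written 5' to 3', that is
    complementary to coding_strand[start:start + length].

    `start` is a 0-based offset into the coding strand; `length` is the
    oligonucleotide length in nucleotides (the text mentions lengths of
    about 5 to 50).
    """
    if start < 0 or length <= 0 or start + length > len(coding_strand):
        raise ValueError("target window lies outside the sequence")
    target = coding_strand.upper()[start:start + length]
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Made-up coding strand; the ATG start codon begins at offset 6.
    coding = "GGCACCATGGCTAGCAAGGTC"
    # A 15-mer spanning the region surrounding the translation start site.
    print(antisense_oligo(coding, start=3, length=15))
```

Targeting a window around the start codon reflects the example in the text; any other coding or noncoding window can be selected by changing start and length.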

Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl- 
2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, 
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a 
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 
described further in the following subsection). 

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered 
systemically. For example, for systemic administration, antisense molecules can be modified 
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-1786
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid 
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above; and
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequencing and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization-triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding 
sequence, amplification of the marker DNA by standard selection methods results in co- 
amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, C127 cells, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 
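
The combined positive and negative selection logic described above can be summarized in a short Python sketch. It is a simplified model under stated assumptions: the class and function names are hypothetical, and a random (non-homologous) integration is assumed to carry both markers into the genome, whereas a correct homologous recombination event retains only the selectable marker contiguous with the targeting DNA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Marker content of a cell after exposure to the targeting construct."""
    has_positive_marker: bool  # selectable marker contiguous with the targeting DNA
    has_negative_marker: bool  # negatively selectable marker, e.g. the HSV TK gene


def survives_selection(cell: Cell) -> bool:
    """A cell survives only if it kept the positive marker and lost the negative one."""
    return cell.has_positive_marker and not cell.has_negative_marker


# Simplifying assumption: each event class maps to exactly one marker pattern.
EVENTS = {
    "no integration": Cell(False, False),
    "random integration": Cell(True, True),
    "homologous recombination": Cell(True, False),
}

if __name__ == "__main__":
    for name, cell in EVENTS.items():
        outcome = "survives" if survives_selection(cell) else "eliminated"
        print(f"{name}: {outcome}")
```

Under these assumptions only the correctly targeted cells pass both selections, which is why the negatively selectable marker is placed so that it flanks the targeting sequence.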

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1787-3572 and 5359-7144
or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO: 1-1786
and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO: 1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO: 1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO: 1787-3572 and 5359-7144.
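
The percent-identity thresholds recited above can be made concrete with a minimal Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and takes the number of alignment columns as the denominator; both choices are assumptions for illustration, since alignment programs differ in how they compute identity. The function names and example sequences are hypothetical.

```python
from typing import Optional

# Thresholds recited in the text, in percent amino acid identity.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float) -> Optional[int]:
    """Return the largest recited threshold that the identity value reaches."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # made-up sequences
    variant = "MKTAYLAKQR"
    identity = percent_identity(reference, variant)
    print(identity, highest_threshold_met(identity))
```

Sequence identity alone does not establish that a variant retains biological activity; that requirement is assessed separately.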

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins.

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO: 1787-3572 and 5359-7144.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
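
The systematic substitution step of the alanine-scanning method described above can be sketched in Python as follows. The sketch only enumerates the variant sequences (single residues or strings of residues replaced by alanine); testing each variant for biological activity is an experimental step that the code does not model. The function name and the example sequence are hypothetical.

```python
from typing import Iterator, Tuple


def alanine_scan(protein: str, window: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (1-based position, variant) pairs for an alanine scan.

    Each variant replaces `window` consecutive residues, starting at the
    reported position, with alanine. Windows that are already all alanine
    are skipped because the substitution would change nothing.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue
        yield i + 1, protein[:i] + "A" * window + protein[i + window:]


if __name__ == "__main__":
    for position, variant in alanine_scan("MKAC"):  # made-up sequence
        print(position, variant)
```

A variant whose activity is lost or reduced relative to the unmodified protein points to the substituted residue(s) as important for function.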

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., a targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified in
computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul, S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software (Nevill-
Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
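
By way of illustration only, the notion of "largest match" can be sketched with a minimal
global alignment of the Needleman-Wunsch type, the general approach on which programs such as
GAP are based. The scoring values (match +1, mismatch -1, gap -2), the function names and the
definition of percent identity as identical positions divided by alignment length are
assumptions chosen for this sketch; they are not the defaults of GAP, BLAST or any other
program cited above.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return one optimal global alignment of a and b (illustrative scoring)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

Other conventions for the denominator (for example, the length of the shorter sequence) are
also in use, so a percent identity value is only meaningful together with the program and
parameters used to obtain it.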

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.
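
The meaning of "fused in-frame" can be illustrated with a short sketch that joins two coding
sequences and confirms that the joined sequence translates without an internal stop codon. The
function names and the toy sequences are hypothetical and are used for illustration only; the
codon table is the standard genetic code.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate complete codons of a DNA coding sequence ('*' marks a stop)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(n_term_cds, c_term_cds):
    """Join two coding sequences so the second is read in the frame of the first.

    The stop codon of the N-terminal partner, if present, is removed so that
    translation continues into the C-terminal partner.
    """
    n_term_cds, c_term_cds = n_term_cds.upper(), c_term_cds.upper()
    if len(n_term_cds) % 3 or len(c_term_cds) % 3:
        raise ValueError("each coding sequence must be a whole number of codons")
    if translate(n_term_cds[-3:]) == "*":
        n_term_cds = n_term_cds[:-3]
    fusion = n_term_cds + c_term_cds
    protein = translate(fusion)
    if "*" in protein[:-1]:
        raise ValueError("internal stop codon: fusion would be truncated")
    return fusion, protein


if __name__ == "__main__":
    # Toy sequences (hypothetical): 'ATGTCCCCTTAA' and 'ATGGCTAAATGA'.
    print(fuse_in_frame("ATGTCCCCTTAA", "ATGGCTAAATGA"))
```

A fusion in which either partner is not a whole number of codons shifts the reading frame of
the downstream partner, which is why the sketch rejects such inputs rather than attempting to
repair them.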

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention, comprising one or more domains, are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
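
The PCR approach using anchor primers with complementary overhangs can be sketched as
follows. This is a simplified illustration under stated assumptions: the function names, the
toy fragments and the six-nucleotide annealing and tail lengths are hypothetical (practical
primers are considerably longer), and no melting temperature or secondary structure checks are
made.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_primers(frag_a, frag_b, anneal=6, tail=6):
    """Design the two inner primers that create an overlap between fragments.

    frag_a is the upstream fragment and frag_b the downstream fragment of the
    intended chimeric gene. Each primer anneals to its own fragment and carries
    a 5' tail taken from the neighbouring fragment.
    """
    # Reverse primer for fragment A: anneals to the end of A, tail from start of B.
    rev_a = revcomp(frag_a[-anneal:] + frag_b[:tail])
    # Forward primer for fragment B: tail from end of A, anneals to the start of B.
    fwd_b = frag_a[-tail:] + frag_b[:anneal]
    return rev_a, fwd_b


if __name__ == "__main__":
    frag_a = "ATGGCCAAAGGT"  # toy upstream fragment (hypothetical)
    frag_b = "TCTGAACTGTAA"  # toy downstream fragment (hypothetical)
    rev_a, fwd_b = junction_primers(frag_a, frag_b)
    print(rev_a, fwd_b)
```

Because both primers span the junction, the two first-round PCR products share the junction
sequence, which is what allows them to anneal to each other and be reamplified as a single
chimeric fragment.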

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of polypeptide as well as for studying modulators of the
polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.



4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor-dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e., for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g. as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed, has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is 
useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed, has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural 
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies 
resulting from chemotherapy or other medical therapies may also be treatable using a 
composition of the invention. 

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 

kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent 
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune 
suppressing activity, including without limitation the activities for which assays are described 
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A 
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell 
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, 
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient 
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89: 11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies 
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means 
of up regulating immune responses, may also be useful in therapy. Upregulation of immune 
responses may be in the form of enhancing an existing immune response or eliciting an initial 
immune response. For example, enhancing an immune response may be useful in cases of viral 
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory form of 
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the 
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present 
invention as described herein such that the cells express all or a portion of the protein on their 
surface, and reintroduce the transfected cells into the patient. The infected cells would now be 
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human 
subject may be sufficient to overcome tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. 
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.




Assays for T-cell-dependent immunoglobulin responses and isotype switching (which 
will identify, among others, proteins that modulate T-cell dependent antibody responses and that 
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. Coligan et al. eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins 
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992. 

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by 
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY 




A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such 
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.


4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.


4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Tissue Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods:

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
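
As a rough illustration of the competitive binding readout described above, the diminution in complex formation can be expressed as a percent inhibition relative to a control without test compound. The following Python sketch is an assumption-laden example only: the function name, the background-subtraction step and the signal values are hypothetical and are not taken from the specification.

```python
def percent_inhibition(signal_with_compound, signal_control, background=0.0):
    """Percent reduction in complex formation caused by a test compound.

    All arguments are hypothetical assay readouts (e.g., counts from a
    labeled binding partner) expressed in the same arbitrary units.
    """
    specific_control = signal_control - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_test = signal_with_compound - background
    return 100.0 * (1.0 - specific_test / specific_control)


if __name__ == "__main__":
    # Hypothetical readouts: control 1000, background 100, with compound 550.
    print(percent_inhibition(550.0, 1000.0, background=100.0))
```

A compound that leaves the signal unchanged scores 0 percent, while one that reduces the specific signal to background scores 100 percent.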

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol,
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including, (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such
as for acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary
to intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).



4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
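
The restriction fragment length polymorphism analysis mentioned above can be illustrated with a minimal in silico sketch. The Python example below assumes a single hypothetical recognition site (GAATTC, cut after the first base, as for EcoRI) and two short made-up sequences; neither sequence is a sequence of the invention, and the function name is illustrative only.

```python
def digest(sequence, site, cut_offset=0):
    """Return fragment lengths after cutting `sequence` at every `site`.

    `cut_offset` is the position within the recognition site at which
    the cut falls (1 for a cut after the first base of the site).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    # Drop zero-length fragments (e.g., a cut at the very start).
    return [f for f in fragments if f > 0]


if __name__ == "__main__":
    reference = "AAAAGAATTCTTTTTT"  # hypothetical allele containing the site
    variant = "AAAAGAGTTCTTTTTT"    # hypothetical allele, site destroyed by one base change
    print(digest(reference, "GAATTC", cut_offset=1))
    print(digest(variant, "GAATTC", cut_offset=1))
```

A difference between the two fragment-length patterns is what indicates the presence or absence of the polymorphism in this kind of analysis.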

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered saline (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
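
The treatment and scoring schedule described above can be laid out as a short Python sketch. It assumes that the day of adjuvant injection is counted as day 0 and that "every other day" means days 0, 2, 4 and so on; the score values in the example are hypothetical placeholders, not experimental results.

```python
# Days after adjuvant injection on which an arthritis score is obtained.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def treatment_days(last_day=24, interval=2):
    """Days on which the test compound is given (assumes injection on day 0)."""
    return list(range(0, last_day + 1, interval))


def mean_score(scores_by_day, days=SCORING_DAYS):
    """Average the arthritis score over the scheduled scoring days."""
    return sum(scores_by_day[day] for day in days) / len(days)


if __name__ == "__main__":
    print(treatment_days())
    # Hypothetical scores for one PBS control animal and one treated animal.
    control = {14: 4, 15: 5, 18: 7, 20: 8, 22: 8, 24: 8}
    treated = {14: 2, 15: 2, 18: 4, 20: 4, 22: 6, 24: 6}
    print(mean_score(control), mean_score(treated))
```

A lower mean score in the treated group than in the PBS control group would correspond to the decrease in arthritis score that the passage anticipates.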

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
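
The weight-based ranges above translate into absolute amounts by simple multiplication. The Python sketch below reads the ranges as 0.01 µg/kg to 100 mg/kg (broad) and 0.1 µg/kg to 10 mg/kg (preferred); the constant and function names, and the 70 kg example weight, are illustrative assumptions. It shows the arithmetic only; as stated above, the actual dosage is determined by the prescribing physician.

```python
# Ranges expressed in mg per kg of body weight (1 µg = 0.001 mg).
BROAD_RANGE_MG_PER_KG = (0.01e-3, 100.0)     # 0.01 µg/kg to 100 mg/kg
PREFERRED_RANGE_MG_PER_KG = (0.1e-3, 10.0)   # 0.1 µg/kg to 10 mg/kg


def dose_range_mg(body_weight_kg, range_mg_per_kg):
    """Return (low, high) amounts in mg per dose for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    # Hypothetical 70 kg patient.
    print(dose_range_mg(70.0, BROAD_RANGE_MG_PER_KG))
    print(dose_range_mg(70.0, PREFERRED_RANGE_MG_PER_KG))
```

For the hypothetical 70 kg patient, the broad range spans roughly 0.7 µg to 7 g per dose and the preferred range roughly 7 µg to 700 mg per dose.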

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined 
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or antithrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other 
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or antithrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient, optionally grinding the resulting mixture, and processing the
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee
cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
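
The VPD percentages quoted above can be turned into weighed amounts with simple arithmetic. The following Python sketch is illustrative only: the function and constant names are hypothetical, "% w/v" is read as grams per 100 mL, and the 1:1 dilution is assumed to be volume-additive, which the text does not state.

```python
# Nominal VPD composition from the text, in % w/v (grams per 100 mL).
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
# Strength of the dextrose-in-water diluent used to make VPD:5W.
DEXTROSE_PERCENT_WV = 5.0


def vpd_masses_g(volume_ml):
    """Grams of each solute for volume_ml of VPD (ethanol is added to volume)."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return {name: pct * volume_ml / 100.0 for name, pct in VPD_PERCENT_WV.items()}


def vpd_5w_percent_wv():
    """% w/v of each solute after 1:1 dilution, assuming additive volumes."""
    diluted = {name: pct / 2.0 for name, pct in VPD_PERCENT_WV.items()}
    diluted["dextrose"] = DEXTROSE_PERCENT_WV / 2.0
    return diluted
```

Under these assumptions, 100 mL of VPD calls for 3 g benzyl alcohol, 8 g polysorbate 80 and 65 g polyethylene glycol 300, and each concentration is halved in VPD:5W.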

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition, 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an amount effective to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that can be used to more accurately determine useful doses in 
humans. For example, a dose can be formulated in animal models to achieve a circulating
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of 
the test compound which achieves a half-maximal inhibition of the protein's biological activity). 
Such information can be used to more accurately determine useful doses in humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
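
The therapeutic index defined above is a plain ratio, which the following Python sketch makes explicit. The function name and the example figures are hypothetical and are not taken from this disclosure; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return LD50 / ED50; both doses must be positive and in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical example: LD50 of 200 mg/kg and ED50 of 10 mg/kg.
example_index = therapeutic_index(200.0, 10.0)
```

A larger ratio means a wider margin between the effective and the lethal dose, which is why compounds with high therapeutic indices are preferred.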

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration. 
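
One way to check a candidate regimen against the time-above-MEC bands is sketched below in Python. This is an assumption-laden illustration: the function names are hypothetical, plasma concentrations are assumed to be sampled at equal time intervals over one dosing interval, and "above the MEC" is read as strictly greater than the MEC.

```python
def fraction_above_mec(concentrations, mec):
    """Fraction of equally spaced plasma samples that lie strictly above the MEC."""
    if not concentrations:
        raise ValueError("at least one concentration sample is required")
    above = sum(1 for c in concentrations if c > mec)
    return above / len(concentrations)


def regimen_band(fraction):
    """Map a time-above-MEC fraction onto the bands named in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred (50-90%)"
    if 0.3 <= fraction <= 0.9:
        return "preferred (30-90%)"
    if 0.1 <= fraction <= 0.9:
        return "within range (10-90%)"
    return "outside 10-90%"
```

For instance, ten samples of which four exceed the MEC give a fraction of 0.4, which falls in the 30-90% band but not the 50-90% band.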

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 
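
The weight-based ranges above translate into absolute daily amounts by simple multiplication. The following Python sketch shows the unit conversion; the helper name and the 70 kg body weight are hypothetical, and the sketch is arithmetic only, not dosing guidance.

```python
UG_PER_MG = 1000.0

# Ranges stated in the text, expressed in micrograms per kg per day.
EXEMPLARY_RANGE_UG_PER_KG = (0.01, 100.0 * UG_PER_MG)
PREFERRED_RANGE_UG_PER_KG = (0.1, 25.0 * UG_PER_MG)


def daily_dose_range_mg(weight_kg, range_ug_per_kg):
    """Return (low, high) daily amounts in mg for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return (low * weight_kg / UG_PER_MG, high * weight_kg / UG_PER_MG)


# Hypothetical 70 kg subject.
preferred_low_mg, preferred_high_mg = daily_dose_range_mg(
    70.0, PREFERRED_RANGE_UG_PER_KG
)
```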

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an 
appropriate container, and labeled for treatment of an indicated condition. 



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the 
invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and 
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from 
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another 
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well, 
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or 
a lambda chain. Reference herein to antibodies includes a reference to all such classes, 
subclasses and types of human antibody species. 

An isolated related protein of the invention, or a portion or fragment thereof, may serve as an 
antigen and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence 
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and 
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific 
immune complex with the full length protein or with any fragment that contains the epitope. 
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino 
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of the related protein that is located on the surface of the protein, e.g., a 
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will 
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely 
to contain surface residues useful for targeting antibody production. As a means for targeting 
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity 
may be generated by any method well known in the art, including, for example, the Kyte 
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g., 
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. 
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives, 
fragments, analogs or homologs thereof, are also provided herein. 
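
The hydropathy analysis mentioned above can be sketched as a sliding-window average over the Kyte-Doolittle scale, on which negative values indicate hydrophilic residues. The window length of 7, the threshold of 0.0 and the function names below are assumptions chosen for illustration; no Fourier transformation is applied, and the Hopp-Woods scale, which uses the opposite sign convention, is not implemented here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for every window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    # A KeyError here signals a non-standard residue code.
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, threshold=0.0):
    """0-based start positions of windows whose mean score is below threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score < threshold]
```

For example, `hydrophilic_windows("IIIKKK", window=3)` flags the windows that are dominated by lysine. Windows flagged in this way are only candidate surface regions and would still need experimental confirmation as epitopes.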

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives, 
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory 
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to 
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the 
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to 
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion 
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin 
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually 
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, 
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and 
mouse-human heteromyeloma cell lines also have been described for the production of human 
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal 
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 
51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the 
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
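
As a rough illustration of how a Scatchard analysis yields a binding affinity, the sketch below fits a straight line to bound/free versus bound for single-site binding, where the slope equals -1/Kd and the x-intercept equals Bmax. This is a plain least-squares fit written for this example, not the procedure of the cited reference; the function name and the data are hypothetical, and the free and bound concentrations are assumed to share the same units.

```python
def scatchard_fit(bound, free):
    """Fit bound/free = (Bmax - bound) / Kd and return (Kd, Bmax)."""
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired bound/free measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("expected a negative slope for single-site binding")
    kd = -1.0 / slope        # dissociation constant; Ka = 1 / Kd
    bmax = intercept * kd    # x-intercept of the fitted line
    return kd, bmax


# Hypothetical noise-free data generated with Kd = 2 and Bmax = 10.
free_conc = [1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
```

A smaller Kd corresponds to a higher binding affinity. A curved Scatchard plot would indicate that the single-site assumption does not hold.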

After the desired hybridoma cells are identified, the clones can be subcloned by limiting 
dilution procedures and grown by standard methods. Suitable culture media for this purpose 
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture 
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for 
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or 
affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the 
invention can be readily isolated and sequenced using conventional procedures (e.g., by using 
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or 
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of 
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains 
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the 
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or can 
be substituted for the variable domains of one antigen-combining site of an antibody of the 
invention to create a chimeric bivalent antibody. 

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen- 
binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., 
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 
2:593-596 (1992)). 

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal 
antibodies may be utilized in the practice of the present invention and may be produced by using 
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by 
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. 
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature 
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and 
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking 
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No. 
5,939,598. It can be obtained by a method including deleting the J segment genes from at least 
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the 
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus, 
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a 
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing 
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant 
epitope on an immunogen, and a correlative method for selecting an antibody that binds 
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., 
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of 
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, 
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen 
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated 
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the 
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments. 

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a 
potential mixture of ten different antibody molecules, of which only one has the correct 
bispecific structure. The purification of the correct molecule is usually accomplished by affinity 
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May 
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659. 
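
The count of ten possible molecules from a quadroma can be checked by enumeration. The sketch below models each antibody as an unordered pair of heavy-chain/light-chain arms and assumes that the two heavy chains and two light chains assort freely; the chain labels H1, H2, L1 and L2 and the function name are hypothetical.

```python
from itertools import combinations_with_replacement, product


def quadroma_products():
    """Enumerate the antibody species expected from random chain assortment."""
    heavy = ("H1", "H2")   # heavy chains of parental antibodies 1 and 2
    light = ("L1", "L2")   # light chains of parental antibodies 1 and 2
    arms = list(product(heavy, light))  # 4 possible heavy/light arms
    # An antibody is an unordered pair of arms; the two arms may be identical.
    return list(combinations_with_replacement(arms, 2))


antibodies = quadroma_products()

# The desired bispecific molecule carries one correctly paired arm of each parent.
correct_arms = {("H1", "L1"), ("H2", "L2")}
bispecific = [ab for ab in antibodies if set(ab) == correct_arms]
```

Four possible arms taken two at a time with repetition give ten species, exactly one of which has both arms correctly paired, consistent with the figure stated above.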

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 

the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region 
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions. 
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin 
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable 
host organism. For further details of generating bispecific antibodies see, for example, Suresh et 
al., Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a pair 
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g. 
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody 
fragments have been described in the literature. For example, bispecific antibodies can be 
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure 
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These 
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to 
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments 
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB 
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is 
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific 
antibody. The bispecific antibodies produced can be used as agents for the selective 
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment 
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form 
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been 
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5): 1547-1553 (1992). The 
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two 
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region 
to form monomers and then re-oxidized to form the antibody heterodimers. This method can 
also be utilized for the production of antibody homodimers. The "diabody" technology 
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an 
alternative mechanism for making bispecific antibody fragments. The fragments comprise a 
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker 
which is too short to allow pairing between the two domains on the same chain. Accordingly, 
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH 
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for 
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been 
reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991). 

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for 
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular 
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also 
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies 
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide 
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest 
binds the protein antigen described herein and further binds tissue factor (TF). 

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies 

have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent 
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond. 
Examples of suitable reagents for this purpose include iminothiolate and methyl-4- 
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980. 

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as 
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine 
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond 
formation in this region. The homodimeric antibody thus generated can have improved 
internalization capability and/or increased complement-mediated cell killing and antibody- 
dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992) 
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti- 
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff 
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been 
described above. Enzymatically active toxins and fragments thereof that can be used include 
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from 
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, 
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and 
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin, 
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of 
radionuclides are available for the production of radioconjugated antibodies. Examples include 
212Bi, 131I, 131In, 90Y, and 186Re. 

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional 
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP), 
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl), 
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido 
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as 
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), 
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a 
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987). 
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX- 
DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See 
WO94/11026. 

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
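
As a hedged illustration of the two storage options named above (an ASCII text file and a database application), the following minimal sketch records a nucleotide sequence in both forms and reads it back. The record name `SEQ_ID_NO_1_example`, the short placeholder sequence, the FASTA-style file layout, and the use of Python's built-in SQLite module in place of DB2, Sybase, or Oracle are all assumptions made for illustration only.

```python
import os
import sqlite3
import tempfile

# Hypothetical record: the identifier and bases are placeholders, not a
# sequence taken from the Sequence Listing.
RECORD_ID = "SEQ_ID_NO_1_example"
SEQUENCE = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"


def write_fasta(path, record_id, sequence, width=60):
    """Record one sequence as an ASCII (FASTA-style) text file."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(f">{record_id}\n")
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


def read_fasta(path):
    """Read back the single record written by write_fasta."""
    with open(path, encoding="ascii") as handle:
        header = handle.readline().strip()
        sequence = "".join(line.strip() for line in handle)
    return header[1:], sequence


def store_in_database(connection, record_id, sequence):
    """Record the same sequence in a relational table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sequences (id TEXT PRIMARY KEY, bases TEXT)"
    )
    connection.execute(
        "INSERT OR REPLACE INTO sequences (id, bases) VALUES (?, ?)",
        (record_id, sequence),
    )
    connection.commit()


def fetch_from_database(connection, record_id):
    """Access the stored sequence by its identifier."""
    row = connection.execute(
        "SELECT bases FROM sequences WHERE id = ?", (record_id,)
    ).fetchone()
    return row[0] if row else None


def roundtrip_demo():
    """Store the record both ways and return what was read back."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sequence.fasta")
        write_fasta(path, RECORD_ID, SEQUENCE)
        from_file = read_fasta(path)
    connection = sqlite3.connect(":memory:")
    store_in_database(connection, RECORD_ID, SEQUENCE)
    from_db = fetch_from_database(connection, RECORD_ID)
    connection.close()
    return from_file, from_db
```

Which structure is preferable depends, as stated above, on the means chosen to access the stored information: a flat file suits sequential reading, while a table suits lookup by identifier.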

By providing any of the nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.
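
To make the notion of identifying ORFs in a stored sequence concrete, the sketch below scans the three forward reading frames for an ATG followed in frame by a stop codon. It is not the BLAST or BLAZE software referred to above; it is a deliberately simple stand-in, and the function name `find_orfs`, the `min_codons` threshold, and the restriction to the given strand are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) for each ATG-to-stop ORF on the given strand.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the ATG but not the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the frame one codon at a time.
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None
    return orfs
```

The complementary strand could be covered by running the same function on the reverse complement of the sequence.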

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means are available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
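
The relationship between target length and random occurrence can be put in rough numbers. Under the simplifying assumption (not made in the text above) that every residue in the database is independent and equally likely, a fixed target of length n matches a given position with probability 4^-n for nucleotides or 20^-n for amino acids. The function name and parameters below are hypothetical.

```python
def expected_chance_matches(target_length, database_length, alphabet_size=4):
    """Expected number of exact matches to a fixed target that arise by chance.

    Assumes every residue is independent and equally likely, which real
    sequences only approximate.
    """
    # Number of positions at which the target could begin.
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)
```

On this model a six-nucleotide target is expected by chance about once per 4,096 positions and a two-amino-acid target about once per 400 positions, which is why longer targets are far less likely to occur at random.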

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 
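
Hairpin structures are named above as one kind of nucleic acid target motif. The following sketch shows one simple way a search means might flag hairpin candidates: it looks for a short stem whose reverse complement reappears after a loop of a few bases. The stem length, the loop range, and the function names are assumptions for illustration, and the search ignores mismatches and thermodynamic stability.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def find_hairpins(sequence, stem=4, min_loop=3, max_loop=8):
    """Return (stem_start, loop_length) pairs for simple hairpin candidates."""
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - 2 * stem - min_loop + 1):
        left = sequence[start:start + stem]
        # The right arm must pair with the left arm, so it must equal the
        # reverse complement of the left arm.
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            right = sequence[right_start:right_start + stem]
            if len(right) == stem and right == target:
                hits.append((start, loop))
    return hits
```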

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide. 
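
As a minimal sketch of how sequence information feeds antisense design, the function below returns a DNA oligonucleotide complementary to a chosen window of an mRNA, enforcing the 20 to 40 base length preferred above. The function name, the example mRNA, and the choice of window are hypothetical; selecting an effective target region in practice involves considerations not modeled here.

```python
# Map each mRNA (or DNA) base to its DNA complement.
COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def design_antisense(mrna, start, length=20):
    """Return a DNA oligo complementary to mrna[start:start + length].

    The oligo is written 5' to 3', i.e. as the reverse complement of the window.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    window = mrna.upper()[start:start + length]
    if len(window) != length:
        raise ValueError("window runs past the end of the mRNA")
    return window.translate(COMPLEMENT)[::-1]
```

For instance, `design_antisense("AUGGCCAUUGUAAUGGGCCGCUGAAAGGG", 0)` targets the first 20 bases of the hypothetical message.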

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers 
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention 
provides a compartment kit to receive, in close confinement, one or more containers which 
comprises: (a) a first container comprising one of the probes or antibodies of the present 
invention; and (b) one or more other containers comprising one or more of the following: wash 
reagents, reagents capable of detecting presence of a bound probe or antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic 
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the 
agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers which contain 
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which 
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed 
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 

4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a 
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target 
site. 

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a polypeptide 
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:1-1786
and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting 
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds 
to a polynucleotide of the invention is identified. 

Likewise, in general, therefore, such methods for identifying compounds that bind to a 
polypeptide of the invention can comprise contacting a compound with a polypeptide of the 
invention for a time sufficient to form a polypeptide/compound complex, and detecting the 
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a 
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also 
comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a 
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that 
binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides, for example see Hruby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kasprzak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding 
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, 
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et 
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO: 1-1786 and 3573-5358. Because the corresponding gene is only 
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides 
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in 
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
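
A degenerate pool can be enumerated from a single probe written with IUPAC ambiguity codes, which is one common way to describe such pools (the notation is an assumption here; the text above does not specify one). The sketch below expands a degenerate probe into every discrete sequence it represents; the function name and example probes are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence encoded by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the number of choices at each position, so a probe such as `GCNAAY` stands for 4 x 2 = 8 discrete sequences.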

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors 
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may 
be used to construct hybridization probes for mapping their respective genomic sequences. The 
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a 
chromosome using well known genetic and/or chromosomal mapping techniques. These 
techniques include in situ hybridization, linkage analysis against known chromosomal markers, 
hybridization screening with libraries or flow-sorted chromosomal preparations specific to 
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation 
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesiser. 

Support bound oligonucleotides may be prepared by any of the methods known to those of 
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to 
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be 
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein. 

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ssDNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
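
The per-well quantities implied by the volumes and concentrations above can be worked out with simple arithmetic. The sketch below assumes, for illustration only, that 7.5 ng/ul is the DNA concentration of the solution actually dispensed (75 ul/well) and that the 25 ul of 0.2 M EDC mixes completely with it; the function and key names are hypothetical.

```python
def well_composition(dna_ng_per_ul=7.5, dna_volume_ul=75.0,
                     edc_molar=0.2, edc_volume_ul=25.0):
    """Compute per-well DNA mass, final volume, and final EDC concentration."""
    total_volume_ul = dna_volume_ul + edc_volume_ul
    return {
        # Mass of DNA dispensed into one well.
        "dna_ng_per_well": dna_ng_per_ul * dna_volume_ul,
        # Volume in the well once the EDC solution has been added.
        "final_volume_ul": total_volume_ul,
        # EDC is diluted by the DNA solution already in the well.
        "final_edc_molar": edc_molar * edc_volume_ul / total_volume_ul,
    }
```

Under these assumptions each well holds about 562.5 ng of DNA in 100 ul, with EDC at roughly 0.05 M; the 1-methylimidazole stays at 10 mM because both solutions contain it at that concentration.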

It is contemplated that a further suitable method for use with the present invention is that 
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate. 

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, 
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et 
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two 
base recognition endonuclease, Cv/JI, described by Fitzgerald et al (1992) Nucleic Acids Res. 
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation 
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of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and 
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy 
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of 
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small 
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the 
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size 
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus 
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and 
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate 
consistent with random fragmentation. 
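
The site specificity described above can be illustrated with a minimal Python sketch. It is not part of the cited work; the function names are hypothetical, the sequence is treated as linear (pUC19 itself is circular), and every candidate site is cut, whereas an actual CviJI** digest is partial.

```python
import re

# Pu (purine) = A or G; Py (pyrimidine) = C or T.
# Lookahead patterns are used so that overlapping sites are all reported.
NORMAL = re.compile(r"(?=[AG]GC[CT])")                           # PuGCPy
RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")    # adds PyGCPy and PuGCPu

def cut_positions(seq, relaxed=False):
    """Return indices at which the sequence is split (between G and C of each site)."""
    pattern = RELAXED if relaxed else NORMAL
    # A site starts at m.start(); the G is at +1 and the C at +2,
    # so the blunt cut falls just before index m.start() + 2.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]

def fragments(seq, relaxed=False):
    """Return the fragments of a complete digest of a linear sequence."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]
```

Calling `fragments(seq, relaxed=True)` models the altered specificity (CviJI**), while the default models the normal PuGCPy specificity.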

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5 
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel 
electrophoresis and elution are needed). 

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. 
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an 
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a 
nylon membrane. By offset printing, a density of dots higher than the density of the wells is 
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By 
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) 
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same 
gene) from different individuals, or may be different, overlapped genomic clones. Each of the 
subarrays may represent replica spotting of the same samples. In one example, a selected gene 
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in 
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is 
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. 
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the 
dot span may be 1 mm² and there may be a 1 mm space between subarrays. 
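
The geometry of the 64-patient example can be checked with a short Python sketch. It is an illustration only: the 9 mm pin pitch (the usual 96-well spacing) and the 8 x 8 arrangement of dots within each subarray are assumptions that are not stated in the text, and the function names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device, laid out like a 96-well plate
PIN_PITCH_MM = 9.0           # assumed centre-to-centre pin spacing
SUB_ROWS, SUB_COLS = 8, 8    # assumed 8 x 8 layout for the 64 patients
DOT_PITCH_MM = 1.0           # one dot per 1 mm span

def dot_position(pin_row, pin_col, patient):
    """Return (x_mm, y_mm) of the dot left by one pin for one patient (0-63)."""
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index out of range")
    if not 0 <= patient < SUB_ROWS * SUB_COLS:
        raise ValueError("patient index out of range")
    # Offset printing: each patient plate is spotted at its own offset
    # inside the subarray belonging to every pin.
    off_row, off_col = divmod(patient, SUB_COLS)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y

def membrane_extent():
    """Return the (width_mm, height_mm) covered by all 96 subarrays."""
    x, y = dot_position(PIN_ROWS - 1, PIN_COLS - 1, SUB_ROWS * SUB_COLS - 1)
    return x + DOT_PITCH_MM, y + DOT_PITCH_MM
```

Under these assumptions each subarray spans 8 mm and repeats every 9 mm, which leaves the 1 mm space between subarrays, and the full pattern fits within an 8 x 12 cm membrane.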

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) 
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid 
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic 
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage 
screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration of the 
present disclosure, one of skill in the art will appreciate that many other embodiments and variations 
may be made in the scope of the present invention. Accordingly, it is intended that the broader 
aspects of the present invention not be limited to the disclosure of the following examples. The 
present invention is not to be limited in scope by the exemplified embodiments which are intended 
as illustrations of single aspects of the invention, and compositions and methods which are 
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and 
variations in the practice of the invention are expected to occur to those skilled in the art upon 
consideration of the present preferred embodiments. Consequently, the only limitations which 
should be placed upon the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from human chromosome 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences which 
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened 
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered 
into groups of similar or identical sequences. Representative clones were selected for sequencing. 
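
The idea of grouping clones by probe signature can be sketched in Python as follows. This is an idealised illustration, not the procedure actually used: hybridization is modelled as an exact substring match on one strand, similarity is measured with a Jaccard index, and the greedy clustering, the threshold and all names are assumptions.

```python
def signature(clone_seq, probes):
    """Set of probes with a perfect match in the clone (idealised hybridization)."""
    return frozenset(p for p in probes if p in clone_seq)

def jaccard(a, b):
    """Fraction of probes shared by two signatures."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cluster(clones, probes, threshold=0.8):
    """Greedy single-pass clustering of clones by signature similarity.

    Each clone joins the first cluster whose representative signature is at
    least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # list of (representative signature, member names)
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep, members in clusters:
            if jaccard(sig, rep) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]
```

One representative per returned group would then be chosen for sequencing, in the spirit of the selection step described above.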

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical 
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye 
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems 
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid 
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction. 

5.1.2 EXAMPLE 2 
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358, 
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend 
the seed EST into an extended assemblage, by pulling additional sequences from different databases 
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene 
version 101) that belong to this assemblage. The algorithm terminated when there were no 
additional sequences from the above databases that would extend the assemblage. Inclusion of 
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage 
with BLAST score greater than 300 and percent identity greater than 95%. 
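
The seed-extension loop described above can be outlined in Python. This is a minimal sketch under stated assumptions, not the algorithm as actually implemented: `search` is a hypothetical stand-in for a BLASTN query against the listed databases, and each member is queried individually rather than the growing assemblage as a whole.

```python
from typing import Callable, Iterable, NamedTuple, Set

class Hit(NamedTuple):
    seq_id: str
    score: float
    percent_identity: float

MIN_SCORE = 300.0      # BLAST score must be greater than this
MIN_IDENTITY = 95.0    # percent identity must be greater than this

def extend_assemblage(seed_id: str,
                      search: Callable[[str], Iterable[Hit]]) -> Set[str]:
    """Grow an assemblage from a seed EST until no new sequence qualifies."""
    members = {seed_id}
    frontier = [seed_id]
    while frontier:                       # stops when nothing new was added
        current = frontier.pop()
        for hit in search(current):
            if hit.seq_id in members:
                continue
            if hit.score > MIN_SCORE and hit.percent_identity > MIN_IDENTITY:
                members.add(hit.seq_id)
                frontier.append(hit.seq_id)
    return members
```

The loop ends exactly when a full pass yields no additional qualifying sequence, which corresponds to the termination condition stated in the text.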

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth 
below. The polypeptides were predicted using a software program called FASTY (available from 
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of the translated 
novel polynucleotide to known polypeptides (W.R. Pearson, Methods in Enzymology, 183:63-98 
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7. 

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank. Other computer programs which may 
have been used in the editing process were phredPhrap and Consed (University of Washington) and 
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants 
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1-327. 

Table 1 shows the various tissue sources of SEQ ID NO: 1-327. 

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3 
search against Genpept release 117, using the FASTXY algorithm. FASTXY is an improved 
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result 
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid 
sequences which the nucleic acid sequences encode are shown in the Sequence Listing. The 
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.3.2 EXAMPLE 4 

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117, 
UniGene version 117, Genpept release 117). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, ed-ext 
and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants resulting from 
these procedures, are shown in the Sequence Listing as SEQ ID NOS: 328-1413. 

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413. 

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP 
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The 
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept. 
The translated amino acid sequences which the nucleic acid sequences encode are shown in 
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in 
Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 



5.3.2 EXAMPLE 5 

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117, 
UniGene version 117, Genpept release 117). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, ed-ext 
and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants 
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1414-1652. 

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652. 

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP 
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The 
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from 
Genpept. The translated amino acid sequences which the nucleic acid sequences encode are 
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are 
shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.4.2 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118, 
UniGene version 118, Genpept release 118). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, ed-ext 
and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants 
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1653-1745. 

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745. 

The homology results for SEQ ID NO: 1653-1745 were obtained by a BLASTP version 
2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The results showed 
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences 
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologues 
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.5.2 EXAMPLE 7 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119, 
UniGene version 119, Genpept release 119). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, ed-ext 
and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants resulting from 
these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1746-1768. 

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768. 

The homology results for SEQ ID NO: 1746-1768 were obtained by a BLASTP version 
2.0a19MP-WashU search against Genpept release 119, using the BLAST algorithm. The results showed 
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences 
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologues 
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.6.2 EXAMPLE 8 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120, 
UniGene version 120, Genpept release 120). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, ed-ext 
and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences which the nucleic acid 
sequences encode are shown in the Sequence Listing. The full-length nucleotide sequences, including splice 
variants resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1769-1786. 

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786. 

The homology results for SEQ ID NO: 1769-1786 were obtained by a BLASTP version 
2.0a19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq 
released on October 26, 2000, using the BLAST algorithm. The results showed homologues for 
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID 
NO: 1769-1786 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS. 

TABLE 1 



Tissue Origin 



adult brain 



RNA Source 



GIBCO 



Hyseq 
Library Name 



AB3001 



SEQ ID NOS: 



9 19-21 50-51 65-66 72 78 80 82 
85 87 107-108 113 116 123 138 
140 150-152 159 169 177 192-193 
202-203 212-214 225-226 235-236 
251 258 268-269 272 290-281 295 
298 301 321 326 331-332 334 356- 
357 362 369 379 382-383 416 423 
443 459-460 473 475 477 488 496 
500 503 519 526 547 574 582 587 
608-609 613 618 633-634 645-646 
652 657-658 660 669-671 678 687 
695 697 710 715 724 731 775-777 
796 804 811 857-859 862 869 899- 
900 912 919 922 924-929 933 936 
962 979 988-989 996 1001 1004- 
1008 1018 1039 1047 1059 1064 
1067 1070 1078 1082 1107 1113 
1116-1117 1131 1134-1137 1140 
1149 1151 1157 1180 1206 1229 
1234 1241 1243 1258 1272-1273 
1279 1288-1290 1294 1307-1308 
1312 1320 1323 1330 1356 1360- 
1361 1368 1373-1375 1379 1391 
1400 1417 1446 1468 1482 1493- 
1494 1501-1503 1506-1507 1512 
1517 1522-1524 1530-1533 1537 
1549 1565 1578 1598 1606 1608 
1623 1625 1627 1639 1643 1648- 
1649 1653 1664 1667 1671 1696 
1734 1741 1743-1744 1760-1761 
1771 



GIBCO 



ABD003 



3 12-14 18-19 25 30-31 34-36 43- 
45 50-51 56 58 60 65-66 68-69 80 
82 85 87 92 104 107-108 112-113 
115-116 123-124 131-132 135-137 
139 142 146 148-149 152 154 157 
159 163 165 167 169 172 180 192- 
193 196-197 199 203 208 210 212- 
214 223 233 235-237 247 257 259 
261 268-269 272 276 280-281 284- 
288 291-292 295 297 300-301 304 
307 317 320-321 323 327 329-331 
333-334 345-349 356-357 379-381 
393 401 408 414 419 424 426-428 
430 433-436 438-439 443 445 449 
453-454 459-461 468 471-473 476- 
478 483 491 494 496 500 503 507- 
508 516 519-520 525-527 534 536- 
540 542-543 545 553 555 560 569- 
570 574-576 586-588 593 595 597 
601 606-609 616-620 622-623 625 
628-633 635-636 643 645-649 653 
655-656 660-665 668-670 676 681 
687 701 710 715 717 724-728 735 
743 745-746 750 753 759 765-766 
773 775-778 786 789 796 799-800 
802-803 810-811 815 817 820-821 
832 834-836 840 845-847 851 858- 
861 864 869 874 878 883 897 901- 
902 904-905 908 911-914 916 921- 
922 924-927 929 932-934 936-939 
941-942 945 955-958 963 966-969 
977 979-980 985-986 990 992-993 
997-1001 1005-1007 1012 1017- 
1020 1023-1024 1029-1031 1034 
1036 1039 1050 1059 1063-1066 
1078 1081-1082 1085-1086 1089 

adult brain 



Clontech 



adult brain 



ABR001 



1097 1103 1107 1105 1112 1116- 
1117 1119 1121 1124 1127 1130 
1134 1144-1145 1149 1151 1157- 
1158 1167 1170 1178 1184 1188 
1190 1193-1194 1200 1202 1215- 
1217 1220 1226-1227 1229 1231 
1241 1243 1247 1252 1258 1263 
1267 1269 1279 1281 1284 1286- 
1289 1293-1294 1306-1307 1312 
1316-1320 1326 1333 1338 1341 
1344 1348 1351 1355-1357 1368 
1374 1377 1380 1386 1389-1390 
1394 1400 1409 1414 1422-1423 
1425-1427 1437 1443 1446 1454 
1456 1458-1459 1468 1470-1472 
1478 1482-1483 1487-1488 1493 
1497 1499 1506 1508-1511 1517 
1522-1524 1530-1533 1545-1546 
1548-1550 1552 1557-1559 1563 
1565 1567 1569 1571 1586 1588 
1591 1593 1595 1598-1601 1608 
1611 1620-1621 1624-1626 1628 
1630-1632 1636 1640-1641 1644- 
1645 1647 1649 1653-1655 1657 
1664 1667 1669 1673 1678-1681 
1686 1690 1694-1696 1701 1709 
1711 1719 1722-1723 1726-1727 
1731-1733 1738 1740 1743-1744 
1747 1749 1753 1757-1758 1760- 
1761 1765 1771 1785 



9 29 68-69 113 115 146 152 206 
223 245 277 307 320 324 330-331 
344 348 352 362 379 384 393 404 
408 414 441-442 454 469 481 490 
506 517 586 597 631 641 659 691 
715 799 003 833 865 871 875 880 
882 908 920 937 1000 1005-1005 
1027 1036 1041 1043 1075 1107 
1112 1121 1127 1136-1137 1144- 
1147 1231 1238-1239 1280 1293 
1320 1345 1355 1361 1383-1384 
1400 1417 1448 1456 1476 1507 
1570 1572 1609-1610 1614 1620 
1626 1645 1653 1754 1759 1770 
1786 



Clontech 



ABR006 



adult brain 



Clontech 



ABR008 



5-8 15-16 168 212-213 271 278 
280-281 291-292 300-301 310 314 
321 326 336-338 341 352 357 359- 
360 362 369 374 379 384 393 396- 
397 414 419-420 426-428 430 441- 
442 453 506 616-617 661 689 785 
798 845 1018 1109 1113 1124 1148 
1167 1187 1207 1227 1252 1265 
1385 1312 1317-1319 1324-1327 
1344 1369 1381 1400 1416 1421 
1427 1430-1431 1436 1471 1501 
1557-1559 1586 1588 1651 1653 
1664-1655 1671 1673 1690 1697- 
1698 1700 1711 1717 1719-1720 
1728 1736 1740 1743-1744 1757 
1760-1751 



5-10 13-19 22-23 25 29 33 37-39 
43-45 50-51 54-55 57-58 60-66 
68-70 72 75 77-80 83 85 89-92 94 
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 168- 
172 174-175 181-184 188-190 193- 
194 196 198-200 202 204-205 207- 
208 210 214-215 218 221-226 229 
231-232 234-241 245-247 251-253 
255 257-259 268-269 271 276-281 
285-286 288 290-292 300-302 304 
307 309^321 313 315 317-318 320- 
322 325-326 328 330-331 333-338 
341 344-347 349 352 354 356-357 
362 369-373 376 379-380 382 384 
387 390-391 393-394 397 399-403 
405-411 414-415 417-420 426-428 
437-438 440-444 453-455 462 464 
467 469-471 476 478 432-484 488- 
491 497 503 506-513 516-517 520 
524-526 528-530 532-534 537-540 
542 544 547-551 553 561 565-567 
572-574 577 581 585 587-588 590- 
591 597 599 601-602 606-610 612 
615-617 619-620 622-623 628-629 
631 633-634 636-641 643 645-647 
651-653 655-664 669-671 673 679 
682 687 689 691-700 702 706 710 
715-717 720-721 725-734 736-739 
742-743 746 750-752 756 758-759 
762-764 766 768 773-778 780-782 
734-785 787-789 794 796 799 802- 
803 805 811 814-815 818 825-826 
834-837 839-840 842-843 856-859 
861-862 865 867-872 874-875 881 
883-884 887 889-892 894-895 897- 
898 901 904 908 910 912 914 917 
919 921-924 926-927 930-932 935- 
941 943 945 949 953-954 958 961- 
963 967 969 971 975 977 981-983 
986 988-990 992 997 999-1002 
1004-1006 1008 1012 1018-1023 
1027 1029-1031 1035-1037 1047- 
1048 1053 1057 1059 1063 1068 
1070 1072-1075 1077 1081-1083 
1085-1093 1095-1096 1108-1112 
1114-1125 1127 1131-1133 1135- 
1138 1142-1145 1148-1158 1160- 
1163 1167 1169 1172 1175 1177 
1180 1183-1188 1191-1195 1199- 
1200 1204 1206 1211 1213-1216 
1222-1223 1226-1227 1229-1231 
1234-1235 1241-1242 1244-1263 
1266 1269-1271 1276-1277 1279- 
1281 1284-1286 1292 1294-1295 
1299 1305-1309 1312 1314 1316- 
1319 1322 1324-1327 1330 1332 
1334-1335 1339 1344-1346 1351 
1354-1355 1357-1358 1365-1367 
1369-1370 1373-1374 1376-1379 
1381-1384 1386-1388 1392 1394 
1396-1397 1400 1403-1407 1410 
1414 1419-1420 1423 1432-1433 
1435 1437-1438 1440-1442 1446 
1448 1453-1455 1457 1461 1463- 
1464 1466 1468 1471 1477 1480 
1482-1483 1496 1502-1504 1507- 
1509 1513 1519-1520 1524-1526 
1536 1547 1549-1552 1567 1573- 
1574 1578 1586-1589 1597-1598 
1601-1602 1605 1607-1609 1611- 
1617 1619-1621 1623 1625-1626 
1635-1641 1643-1645 1649 1651 
1653 1656-1658 1664 1669 1671-
1674 1676-1684 1686 1689-1690 
1694-1696 1704-1705 1708-1709 
Tissue Origin
RNA Source



adult brain 



adult brain 
adult brain 



Clontech 



BioChain 



Hyseq 
Library Name 



ABR011 



ABR012 



Invitrogen 



ABR013 



SEQ ID NOS: 



1720-1724 1726-1728 1730-1733 
1737-1740 1742-1745 1753 1756- 
1757 1759-1761 1765 1767 1771- 
1772 1776-1777 1779-1780 1786 



24 75 103 186 210 310-311 364- 
365 508 623 710 937 1002-1003 
1059 1204 1609 1731-1732



46 182-184 204-205 300 733 767 
1371 1549 1620 1684 



185 204-205 364-365 393 497 595
687 692-694 830 845 1068 1320 
1413 1640 



adult brain 



Invitrogen 



ABR014 



adult brain 
adult brain 



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1738 



Invitrogen 



ABR015 



419 434-435 441-442 763 789 983 
1320 



Invitrogen 



ABR016 



312 364-365 379 1320 1334-1335 
1674 1722 1785 



adult brain 



Invitrogen



ABT004 



cultured 
preadipocytes



Stratagene



14-16 22-23 25 37-39 43 58 60 
70-72 78 86 94 107 113 116 136- 
137 143 146 152 161 173 182-184
194 196 198 210 218 229 259 267 
295 298 309-310 320-321 324 336- 
338 346-347 349-350 356-357 362 
371 379-380 382-383 391 393 396 
399 401 408 428 438 459 461 476 
482 490 502 507-509 516 526 531 
557 562 597 602 607-609 624 652 
655 667 669 671-672 687-689 695- 
696 710 712 715 721 732 739 743 
750 753 766 778 780-781 789 803 
814 826 830 837 841 857 869 874 
894-895 925 937 949 954-956 960-
961 963 968-969 988-989 1000
1005-1006 1016-1019 1021 1036- 
1037 1052 1086 1090 1109 1113 
1115 1120-1121 1123-1124 1136- 
1137 1140 1144-1147 1151 1167 
1170 1174 1188 1193-1194 1205
1225 1229 1231 1254 1258 1262
1280 1285 1309 1312 1334-1335 
1341 1343-1344 1356-1357 1370 
1378-1379 1383-1384 1403-1404 
1423 1429 1434 1442 1448 1451- 
1452 1454 1470-1472 1482 1499 
1525 1528-1529 1532 1536 1547 
1554 1557-1559 1561-1562 1567
1585 1588 1590 1595 1601-1604 
1608 1610-1613 1615 1619 1624 
1627 1640 1644 1647 1660 1664 
1666 1670 1675 1696 1704 1715 
1723 1727 1738 1760-1761 1768 
1779 1785-1786 



ADP001 



5-8 11 17 25 68-69 80 82 87 103 
105 110 116 136-138 168 171 188- 
189 196-198 261 267 276 288 293 
301 318 331 336-338 379-380 391 
400 428 430-431 510-512 520 524 
527 549 557 561 602 618 620 622 
631 637 647 670 681-682 710 731 
748 782 793-794 817 834-836 843 
845 858-859 879 882 893-895 934 
960 982 986 995-996 1000 1002 
1005-1007 1025 1027-1028 1032 
1039 1045 1071 1078 1097 1099- 
1102 1136-1137 1140 1219-1220 



adrenal gland 



1260 1271 1297-1298 1314 1320 
1322 1329 1339 1345 1365-1366 
1370-1371 1398 1408 1423 1431 
1437 1466 1468 1533 1539 1594 
1602 1608 1614 1631 1649-1650
1660 1662 1673 1687-1688 1696 
1711 1719-1720 1742 1746 1749 
1760-1761 1765 1767 1771 1785 



Clontech 



ADR002 



adult heart 



4-10 15-16 25 29-31 43-45 47 50-
51 55 60 62-63 65-66 75 80 102 
116 118 122 126 130 137 150 169- 
170 181 192 198 201-203 215 227- 
228 247 251 255 267-269 271 280- 
281 285 295 298 311 336-338 342
349 351-352 354 372-373 383-385 
391 400 410 415-416 424 426-427 
431 434-437 439 445 454 461 473
477 483 491 493 497-498 503 516 
519 527 535 546 549 552 572-573
581 588 595 600 602 608-610 620 
628-630 637 645-646 670 679 703 
713 715 719 732 734 744-746 758 
773-778 789 816 829 837 845 848 
869 875 883 898 904 912 922-923 
930-931 942 948 952 965 967 969 
976-977 981 990 992-993 1001 
1004 1049 1055 1059 1071-1072 
1076 1112-1113 1115 1121 1127 
1134-1135 1151 1158 1163 1175
1181 1188 1209 1218 1224-1225 
1227 1231 1243 1270-1271 1274 
1280 1285 1290 1293 1307 1324- 
1325 1327 1330 1342-1343 1345
1348 1365-1366 1369 1378-1379 
1387 1398 1400 1405 1417 1425- 
1426 1436 1440-1441 1444 1454 
1463-1464 1488 1491 1507 1512 
1538 1546 1567 1573-1575 1588 
1598 1609 1614 1618 1622 1624 
1627 1634 1636 1649 1651 1658 
1671 1674 1678-1679 1691-1692 
1703 1717 1727 1731-1732 1737 
1765 



GIBCO 



AHR001 



4-8 10-11 15-16 18-21 34-39 44-
46 50-52 57-58 60 62-63 71 75 82
85 87 89 94 97 100 103-104 108-
110 112 114 116 118-119 122-123
127 130-132 134 136-138 141-144
147-151 153 163 164 168 171 179
186 192 195 197 199 204-205 212-
215 220 225-226 229-230 232 234-
236 251 257-260 262 265 272 274
277 280-282 285-286 289-292 296
298-301 304 307 309 314 321 324-
325 330 333 336-338 345 349 351-
352 354 358 361 368 370 380 383-
384 387-388 391 393 397 401 406
408-409 411-412 414-416 430-431
433-439 445-446 449 452 454-455
457 459 462 469 472-473 476-480
483-484 487-490 492-493 496-498
503 506 508 510-513 516 519-522
526 534 536-540 542 546 549 553
560-562 574-577 581-582 584 586-
587 589 593 595 597 604-609 611-
612 615-620 622-623 626 632 637
645-652 656-660 665-666 670-672
674-675 683-684 687 692-694 697
701 709 712 715-716 719-720 725-
726 728 730-732 735 738-739 743- 
744 746 751 753 759 761 765 770-
771 775-780 785 788-790 796 802 
804 810 812 817 821 826 828 830 
837 843 845-847 849-853 857-861 
863-864 869 871 875 877-879 881 
883 887 890-892 894-895 897-898
901 903 906-907 911-913 915 919 
921-925 927-928 933-935 945 958 
961-963 967 969-972 975 977-978 
980-986 990 992 999-1002 1005- 
1007 1010 1016 1019-1020 1022-
1023 1025 1028-1037 1039-1040 
1043 1047 1050 1054-1055 1057 
1059 1063-1064 1067-1068 1070 
1072 1075-1076 1083 1085-1087 
1089 1093-1094 1104 1106 1108-
1109 1113 1116-1117 1119 1121 
1124 1126 1128 1131-1134 1144- 
1145 1148-1149 1151 1158 1167 
1169-1170 1175 1177 1192 1196 
1199-1200 1202 1206-1208 1211 
1216 1218 1222 1227-1229 1232- 
1235 1238-1241 1243-1244 1247- 
1248 1250 1253-1254 1256-1258 
1261 1268 1270-1271 1277 1280- 
1282 1287 1292 1298-1299 1306 
1308 1317-1321 1324-1325 1330 
1332 1334-1337 1339 1344-1345 
1349-1350 1354-1356 1359-1360
1365-1366 1369 1371 1374-1375 
1378-1380 1383-1384 1389 1397 
1400 1403 1409 1417 1423-1426 
1437 1439 1442 1444 1446-1447 
1450 1453 1468 1470 1473 1479
1481 1488 1490 1501-1504 1519 
1521 1524 1528 1530-1534 1536- 
1537 1539 1541-1542 1547 1553 
1555 1560 1565 1567-1571 1588 
1591 1597-1598 1601-1602 1605 
1614-1616 1619-1620 1623-1628 
1630-1632 1634 1636 1641 1644- 
1645 1647 1649 1652-1655 1659 
1662 1667 1673-1674 1680-1681 
1684 1686-1688 1704-1705 1709 
1711-1712 1717 1724 1726-1727 
1731-1733 1737-1738 1741 1743- 
1744 1749 1754-1755 1760-1761 
1765 1772 1785 



adult kidney 



GIBCO 



AKD001 



4-8 10-11 17-21 29-31 35 i 39 42- 
45 50-51 56-58 60-61 64 68-69 75
77 80 82 85 87 92-94 97 100 102-
104 107-108 112 116-117 119 123 
127-133 136-137 139-141 143-144 
147-154 157 161-163 165-166 169 
172 176 178-179 192 194-197 199 
201 203-206 209-210 212-213 215- 
216 223-228 234-236 238 247 251- 
253 257-259 261-262 265-269 271- 
272 274 276-277 279-281 284-286
290 293 295 298-299 301-302 304 
307 311-313 321 325-326 329-331 
333 341 344 348-350 352 356 358- 
359 362 364-365 368 370-372 374 
376-377 380-382 392 395 398 400- 
401 404 407-409 414-415 423-424 
430-437 443-444 446 449 451 453- 
455 459 461-462 464 467 469 471-
474 476-477 480-481 483 487-488 
490-491 493 497-505 510-513 516-
520 522 524 526-529 534 537-540
544 547 549 554-556 560 562 564
567 571-576 578 582 586-589 592-
593 598-599 601 604-606 608-613
615-619 621-626 632-634 637-643
645-652 655 660-664 669-672 676
678-679 688 692-695 698 702 711
713 717 719-720 727 731 735-736
738 743 745-746 751 753 755 762-
763 765 771-773 775-778 780 786
788 793 795-796 800 803 805 808
810-812 814-819 821 826 829 832
834-838 842-845 848 857-861
864-865 867 869 871 874 [illegible]
886-887 889-891 893 [illegible]
902 906-908 910-914 [illegible]
[illegible]-927 929-935 937 940-
[illegible] 949 951 953-958 960-961 963-
[illegible] 969-970 972 976 [illegible] [illegible]
988-990 992-993 995-997
1004-1008 1010 1012-1013 1016-
1017 1019-1020 1022 1025-1031
1035 1038-1040 1042 1044 [illegible]
1050 1054-1055 1057-1064 1068
1070-1073 1078 1085-1086 1088-
1089 1092 1094 1097 1099-[illegible]
1107 1109-1112 1116 [illegible] 1121
1123-1125 1132-1135 1140 1142-
1143 1146-1147 1149 1153-
1154 1157 1159 1163 [illegible] [illegible]
1178-1179 1181 1183 1192 1196-
1200 1202-1204 1206-1211 1216-
1219 1221-1222 1225 1227-1230
1232-1234 1238-1241 1243-1244
1246-1247 1253 1257-1258 1260-
1261 1267-1268 1270 1272-1274
1281 1283 1287-1289 [illegible]-1295
1299 1306 1308 1311-1313 1317-
1320 1323 1329-1330 1334-1335
1339 1341 1349-1350 1353-1357
1359 1367 1369 1373 1375 1378-
1379 1394 1397 1400 1405
1407-1409 1417 1419 [illegible]-1424
[illegible]-1431 1433 1437-[illegible] [illegible]-
1443 1445-1446 1448-
[illegible] 1454 1459 1461 1465-1468 [illegible]-
1475 1478 1484-1488 1490 1492-
1493 1495 1497-1498 1506-1507
1509 1512 1518 1521-1522 1525
1527-1528 1532-1533 1537 1540-
1541 1547-1550 1552 1556-1559
1561 1565-1566 1568 1571 1575
1578-1579 1583 1586-1587 1589
1591-1592 1594 1598 1600 1603-
1604 1606 1608 1611 1613 1615-
1616 1618-1622 1624-1628 1631-
1632 1634-1636 1638-1639 1641
1644 1646-1649 1653-1656 1662
1664 1666-1667 1670-1671 1676-
1679 1683-1684 1686 1691-1692
1696-1699 1701 1709-1711 1713-
1714 1716-1719 1723-1724 1726-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1751 1760-1761
1763-1768 1778 1780 1785




adult kidney 


Invitrogen 


AKT002 


20-21 37-39 47 52 57 60 65-66
68-69 80 104 107-108 122 130 133
136-137 140 142-143 149 169 174
181 197 227-228 235-236 244 251
261-265 267 280-281 286 290 299 
301 304-305 309 312-313 339 341 
344-345 349 358 370-372 376 382- 
383 387 392 401 414 416 421 430 
443 445 449 453-454 472 487-488
504 506 513 516 519 522 528 536-
540 546 554 585 587 594 598 602 
607 616-617 626-627 636 643 662- 
664 695 709 721 735 743 761 768
775-777 788 796 804 814 827 837- 
838 849-850 852-853 869-870 881
890-892 898 903 905-907 914 919 
925 927 934 941 949 952 957 960 
962 968 970 1000 1008 1029-1030 
1044 1052 1055 1063 1067-1068 
1073 1085 1099-1102 1107 1110- 
1111 1113 1115 1119 1126 1134 
1136-1137 1146-1148 1153 1159 
1192 1196 1199 1232-1233 1241 
1256 1264 1272-1273 1281 1285 
1293-1294 1299 1312 1320 1324- 
1325 1330 1344 1349 1351 1355- 
1356 1369 1378-1379 1403 1414 
1419 1428-1429 1436 1446 1458 
1463-1464 1467-1468 1470 1477- 
1478 1486 1491 1509 1519 1527 
1529 1534 1547 1596 1600 1619 
1623 1629 1631 1634 1638 1643 
1647 1652 1660 1664 1667 1669- 
1670 1673 1686 1709 1727 1740 
1776 



adult lung 



GIBCO 



ALG001



lymph node 



4-8 14 37-39 44-46 50-51 56 62-
63 75 82 88 93 103 104 113 125
133 140 143 150 152 154 157 162
171-172 174-175 190 191 196 200
211 214 219 223-224 227-228 251-
252 256 265 272 274 280-281 285
310 332 345 351 362 371 381-382
394 408-409 431 436 445 454 459
461 467 469 471 476-477 488 504
513 527 537-540 544 547-548 554
564 583 607 616-617 621 623-624
634 645-646 662-664 670 695 716
719 743-744 763 766 774 789 803
811 814 817 831-832 837-838 845
852-853 858-859 861 866 880 887
901 905 941 954-957 966 971 977
979 981 987 990 992 996 1001
1005-1006 1014 1017 1045 1047
1054 1059 1062 1064 1072 1080
1086-1089 1094 1107 1126 1134
1136-1137 1142 1150 1157 1173
1190 1200 1208 1220 1241 1272-
1273 1280 1282 1295 1306 1320
1331-1332 1353 1374 1379 1383-
1384 1404 1409 1423 1434 1436
1442 1474 1478 1494 1509 1522
1525 1531-1532 1547 1549 1553-
1554 1571 1598 1606 1613 1624
1627-1629 1632 1642 1644 1662
1669 1676-1677 1684 1696 1727
1731-1732 1737-1738 1748-1749
1786



Clontech 



ALN001 



4 24 50-51 82 105 137 153 198 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 382 421
430 433 445 451 461-462 475 481- 
482 503 526 529 537-540 546-547



young liver 



621 626 649 679 719 725-726 738
793 803 831 834-836 838 844 857-
858 866 879 905 913 928 963 976
1005-1006 1012 1038 1050 1116-
1117 1151 1199 1204 1226 1243
1265 1274 1324-1325 1339 1353
1374 1377 1440-1441 1447 1504
1549 1600 1618-1619 1631 1641
1644 1653 1687-1688 1691-1692
1741 1771



GIBCO 



ALV001 



adult liver



Invitrogen



ALV002 



5-8 11 20-21 46 50-51 58 65-66 
75 79 82 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
174 187-189 194-195 198 209 214- 
215 230 250 258 267-269 280-281 
306 309 342 351 356 359 362 372 
374 392 394 398 401 407-408 410 
414 431 444 455 459 476 478 483 
493 510-512 515 520 522 526 536 
549 571 574-577 585 592 601-602 
607 621-624 628-630 632-633 637 
648 660 666-667 678 697-698 700 
717 719 728 730 734 738 744-745 
766 770 773 779 788 800 808 812
814 841 849-851 871 874 879 887 
893 898-900 902-904 906-907 911 
919 922 924 934 953 957 963 965 
970 984 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1086 1089 
1093 1099-1102 1110-1112 1116- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 1196 
1199-1200 1209 1211 1219-1220 
1241 1244 1262 1270 1275 1279 
1283 1295 1317-1320 1332 1339 
1344 1359 1362-1363 1379 1383-
1384 1403 1415 1430-1431 1437 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516
1518-1519 1526 1529 1547 1550- 
1552 1557-1559 1565 1583 1587
1597 1609 1614 1620 1631 1637 
1641 1644 1654-1655 1662 1667
1669 1684 1691-1692 1702 1711 
1725 1738 1741 1743-1744 1758 
1760-1761 1763-1765 1769 



5-8 17 20-21 32-33 41 55 58 64 
75 77 86 89 102 108 117 119 175- 
176 198 200 209 231 235-236 250 
272 275-27S 284 306 316 321 325 
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782 
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904 
911 918 921-922 926 946 948 972
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220 
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 
1550 1567 1578 1581 1583 1594
1597 1601-1602 1611-1612 1615
1618-1619 1621 1625 1637 1645
1647 1652 1654-1655 1660 1666
1669-1671 1684 1706 1722 1737-
1738 1742-1744 1760-1761 1763-
1765 1772 1774



Clontech



adult liver 
adult ovary 



ALV003 



29 676 997 1063 1119 1536 1766 



Invitrogen



AOV001 



1 4-18 20-23 29 35-40 42-48 50- 
51 53-58 61-63 65-66 68-69 73-75 
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 166 168- 
170 174 177-178 180 182-186 188-
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271- 
272 274 277-281 284-286 288 290 
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398 
400 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552 
554-555 561-564 566-567 569-570
572-573 575-576 579 581 583 585-
588 590-591 593 595 597 599 601- 
605 607-613 615 618-622 624-627 
630 632-633 635-640 642 644-647
649-652 654-655 657-665 667-675 
677-678 681 683-684 692-695 697-
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832 
837-838 843-850 852-857 859-864 
867 869 871-872 874-875 878-883
887-888 890-895 898-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 1106-1108
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195
1197-1200 1202 1205-1214 1217- 
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
1268 1270 1275 1278 1280-1283
1286-1289 1291 1293-1294 1298-
1299 1306 1308 1312 1317-1321
1323 1327 1329-1330 1332-[illegible]
1338-1341 1343-1351 1356 1359
1365-1366 1371-1375 1377-[illegible]
1383-1384 1386 1394 1400 1404
1416-1417 1427 1429-1431 1435-1436
1439-1443 1445-1450 1453-1454
1459 1463-1464 1466 1468 1470
1474-1481 1484-1485 1488 1491
1493-1494 1496-1498 1501-1504
1506-1507 1511-1517 1519 1521-1524
1526-1527 1530-1531 1534-1536
1538-1539 1541 1546 1548-1550
1553 1555-1559 1561-1563 1566-1567
1569-1570 1572 1574-1575 1578
1580-1581 1587-1588 1590-1591
1595 1597-1598 1600-1606 1609
1611-1621 1623-1630 1634 1636
1638 1641 1643 1645 1647-1657
1659-1662 1664 1667 1669-1671
1673-1674 1676-1681 1683-1690
1699 1702-1707 1710-1711 1713-1714
1716-1719 1723-1724 1726-1728
1731-1733 1735 1737-1738 1740-1741
1743-1744 1748-1751 1753 1755-1756
1760-1762 1765 1767-1768 1770-1771
1776 1778-1779 1783-1784 1786





adult placenta 



Clontech 



APL001 



5-8 44-45 90-91 107-108 159 178
311 351 414 476 503 545 574 624
636 719 755 773 850 890-891 924
947 955-956 962 990 992 1002
1045 1202 1320 1369 1628 1686
1713-1714 1743-1744



placenta 



Invitrogen 



APL002 



14-16 26 29 43 60-61 79-80 103
106 116 135 171 177 180 194 196
198 210 216 235-236 272 290 299
309 329 334 339 359 379-380 417
423 430 434-435 448 454 483 490-
491 517 522 631 723 725-726 728
738 746 769 818 843 854-855 857-
858 916 948 953-954 976 988-989
1005-1006 1013 1033 1036 1064
1068 1070 1086 1139 1144-1145
1160 1277 1285 1317-1320 1343
1345 1429 1435 1438 1454 1482
1486 1490 1512 1519 1532 1549
1592-1593 1602 1626 1647 1649
1664 1673 1675 1722 1727 1730
1746 1776



adult spleen 



GIBCO 



ASP001 



3 5-8 12 15-16 19-21 24 29 34-36
44-45 57 60 82-83 87 89 94 98-99
103 106 108 117 119-121 139 141
147 152-153 155 166 169 171 174
178-180 196 198 201-206 209-211
215 219 234 253-254 256 258 264
272 280-281 290 295 302 309 312
325 333 341 349 358 372 382 386-387
394 406 414 431 434-436 446 448
451 473 481 490-493 500 503 505
517 519 530 534 536-540 547 554
557 574-576 582 592 595 604 611-612
620-621 623 631-632 642 652 659
661 667 671 673-675 684 700 721
728 730 732 738 742-744 746 762
765 774 780 788-789 794 810-811
817 822 830 832 845 848 852-853
858 862 866 874 879 882
884 906-908 912 919 921-923 926-
927 934 947-949 957-958 963 977-
978 983 990 992-994 996-997 999
1005-1007 1010 1012 1031 1036
1042-1044 1046 1049 1059 1068
1070 1076 1089-1090 1094 1103
1109 1113 1115 1124 1140 1163
1170 1174 1177 1190 1196 1219-
1220 1226-1227 1229 1236 1241
1246 1258 1269 1271 1274 1295
1301 1320 1322 1330 1334-1335
1339 1349 1351 1353 1359-1360
1364 1369 1374 1386 1397 1413
1417 1434 1436-1437 1439 1468
1474 1477 1480 1485-1487 1498
1512 1522 1525 1544-1549 1553
1560 1567 1591 1600 1631 1635
1651 1654-1655 1658 1662 1670
1674 1678-1679 1684 1686 1700
1727 1733 1738 1740-1741 1760-
1761 1774 1779 1781-1782



testis 



GIBCO 



ATS001



5-8 10 26 30-31 47 50-51 57 68- 
69 82 84-85 97 102 113 119 137 
139 150 152 154 156 163 169 174 
176-177 192 194 196-197 212-215 
227-228 247 255 258 261 282 285 
288-289 301 307 311 316 330 334 
349 370-372 392 398 410 415 426- 
427 430-431 433 437 446 454 461 
469 473 477 481-482 493 499 502- 
503 513 522 526 547 552-553 563- 
564 572-573 575-576 581-582 585 
599-602 605 612 615-617 620 631
637 647 649-650 656 660 665 670 
674-675 712 719-721 723 728 731 
738 744 746 773 780 784 788-789
802 804 809 811 814 826 831 837 
843 845 848 859 866 869 877 905
913 916 919 921 926 929 937 950 
960 963 971 975 977 981 990 992-
993 1007 1016 1029-1030 1034- 
1035 1038-1039 1045 1059-1060 
1064 1070 1072-1073 1087 1089 
1097 1099-1102 1104 1108 1113 
1141 1149 1161-1162 1175 1208- 
1209 1222 1227 1229 1231 1235 
1238-1239 1243 1253 1285 1287- 
1289 1291-1293 1307 1311 1317- 
1320 1330 1332 1338 1345 1369 
1373-1374 1379 1389 1399-1400 
1409 1423-1424 1430 1435-1437 
1443 1459 1484 1486 1490 1493 
1496-1497 1501 1505 1509-1513 
1527 1530-1531 1533 1537 1546 
1549 1563 1565 1567 1569 1571 
1577 1586 1591 1599 1602 1625 
1628 1630-1632 1636 1639 1642 
1649 1661-1662 1666-1667 1670 
1675 1684 1690 1699 1705 1712
1717 1724 1730 1737-1738 1752 

1767 1779 

686 1352 1412 



Genomic DNA 
from BAC 63118 



Research 
Genetics 
(CITB BAC 
Library) 



BAC001



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



1411-1412 
Tissue Origin



Genomic DNA 
from BAC 333 16



adult bladder 



RNA Source 



Research 
Genetics 
(CITB BAC 
Library) 



Invitrogen 



bone marrow 



Clontech 



Hyseq 
Library Name 



BAC003 



BLD001 



BMD001 



SEQ ID NOS:



1352 



5-8 17-18 22-23 33 37-39 56-57
80 93 100 120-121 169 201 237 
251-252 272 278 311 348 353 382
413 415 424 430 443 483 502 542- 
543 562 564 607 616-617 626 635 
652 667 671 710 727 755-756 762 
773 786 789 837 840 866 893 898 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167
1185 1189 1199 1270 1369 1481 
1536 1560 1573 1596 1614 1636- 
1637 1649-1650 1654-1655 1658
1669 1671 1690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



3-8 11 13 18 29-31 33 35-36 40 
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263- 
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567 
569-577 581 583-586 588 593 601
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-778 780 785-786 789-791 
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848- 
855 858-859 866-867 869 878-880
883 890-892 896 903 905 908 912- 
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 973 
976 981 985 987 990 992 995 1000
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266
1270 1278 1281 1285 1287 1290-
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494
1506 1509 1513 1521-1522 1524
1526 1528 1531 1536-1537 1543
1546 1548-1549 1552 1554-1555
1557-1559 1571-1572 1581 1589
1592 1597-1600 1609 1614 1621
1626-1628 1630-1632 1634 1636
1638-1639 1641 1646-1647 1651
1653-1655 1661-1662 1676-1681
1684 1686 1690 1702 1707 1711
1713-1714 1717 1720 1722-1723
1727 1737-1738 1740 1758 1767
1772 1781-1782 1785-1786





Clontech 



bone marrow 



BMD002 



bone marrow 
bone marrow 
adult colon 



11 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 137 
139 169-170 174 177 180 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 360 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 516 
520 523 525 531 545 548 552 566 
569-570 581 583 590-591 597-598 
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834-
836 854-855 859 866 869 871 878-
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036 
1042 1048 1051 1054-1055 1058 
1088-1089 1106 1112-1114 1155 
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1261 1282-1283 
1285 1287 1295 1314 1317-1321 
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406 
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1550 1573- 
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687-
1688 1690-1693 1696 1699 1702 
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 
1786 



Clontech 



BMD004



73-74 503 922 1036 1711 



Clontech 



BMD007



95-96 866 1320 1475 



Invitrogen 



CLN001



17 56-58 103 110 117 144 150 171
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277
288 310 312 320 333 359 386 388
394 408 420 455 481 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 
Tissue Origin



Mixture of 16 
tissues - 
mRNAs 



Mixture of 16 
tissues - 
mRNAs



RNA Source 



Various 
Vendors 



adult cervix 



Various 
Vendors 



BioChain



Hyseq 
Library Name 



CTL016 



CTL021 



CVX001 



SEQ ID NOS: 



1462-1464 1512 1556 1583 1587

1554 1596 1614 1625-1626 1631 

1639 1645 1650 1675-1677 1687- 

1688 1701 1713-1714 1724 1740 
1765 



401 1490 1686 



312 782 1132-1133 1403 1712 1715



1 4-8 11 13 18-21 25-26 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113 
118 122 126 130 134 140 147 153- 
156 163 170 179 181 186 192 195- 
196 198 201-202 218-219 222 229-
231 257 266 276-277 285-286 288 
298 301-302 304 307 312-314 324 
326 329-330 332 335 342 352 358 
362 371-372 376 379 381-382 384 
388 398 400 410 414 416 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513
516-517 526 530 535 542-544 546-
547 557 561 572-573 575-577 581-
582 585-586 588-589 593-594 600 
602 604-605 607-609 612 615-619 
623 644 650 654 657-658 662-665 
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831- 
832 834-836 843 847-848 851-855
857-860 864-866 869 871 876 878-
880 882 887 890-891 897 899-902 
905-908 912-913 916 918-919 922 
927 932 934-938 944 948 955-956
958 963-964 967 969-970 972 976
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053-
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345 
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
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Tissue Origin 


RNA Source 


Hyseq 




SEQ 


ID NOS: 








Library Name 


















l*UO X*mJ.O X**jL^* 


-1427 1431 1436- 








1437 1442 1446 


1448 1453 1459 








1466 1472 1478 


1482 1496 1501- 








1503 1506 1512


1522 1527-1528 








1531 1533 1541


1547 1569 1571 








1585 1589 1597- 


-1598 1600 1608- 








1609 1614-1616 


1620 1623-1624 








1626-1628 1630 


1638 1641 1643 








1649 1653 1656 


1662 1667 1669 








1674-1675 1683 


1685-1688 1699 








1702 1709-1710 


1715 1717 1722 








1724 1729 1731- 


1732 1735-1739 








1741 1743-1744 


1748-1749 1755 








1760-1762 1767


1773 1778 1785- 








1786 








diaphragm 


BioChain 


DIA002 


137 


282 289 730 


780 986 


1409 








1478 1599 1614 








endothelial 


Stratagene


EDT001 


3 5 


-10 13 15-21 


24- 


26 29 34 37- 


cells 






39 


42 44-45 50- 


51 53-55 


57-58 








60- 


61 65-66 68- 


69 73-74 


77-78 80








82- 


83 85 87 89 


93-96 101-105 108 








110 


112-114 116 


118 


-122 


124 128 








133 


-134 137-142 


147 


-150 


152-153 








161 


-163 166-172 


176 


-179 


187 190 








192 


194 196-201 


204 


-207 


210 212- 








214 


220 224 229 


-230 


233 


235-236 








240 


-241 251-252 


258 


261 


-262 265 








267 


-269 272 276 


-277 


279 


-281 284- 








285 


288 290 295 


-296 


301 


-302 310- 








311 


313 316 321 


325 


329 


331-333 








335 


340-342 351 


-355 


360 


371 375 








380 


-382 384 387 


390 


392 


397 400 








407 


-408 410 412 


414 


416 


425-427 








431 


434-436 439 


444 


-445 


449 454 








4oo- 


-464 472-475 


477 


-479 


486 488- 








490 


497-498 500- 


-504 


510- 










519 


522 524 526- 


-528 


532- 


-534 536- 








540 


542-546 548 


561 


-563 


566-567 








572- 


-576 579 581


585 


-586 


589 593 








595 


597 599 603 


607 


-612 


615-617 








620 


622 626 630 


632 


-634 


638-641 








644 


647 656-660 


662 


-664 


670 673 








678 


680-682 692- 


697 


707 


709-710 








712- 


713 719 730 


732 


734 


736 738 








743- 


746 751 759 


768 


771 


773 775- 








778


783 786-789 


793 


800 


803 805-








807 


810-811 814 


816- 


818 


821-822 








824 


826 828-829 


832 


834- 


838 842- 








845 


848-850 854- 


860 


862 


864 869 








871 


874 876-879 


883 


885 


887 890- 








891 


894-895 898- 


900 


903 


908 910- 








913 


916 919-922 


924 


926- 


928 930- 








935 


939 943 948- 


949 


951- 


954 957 








959- 


961 964 969- 


970 


973 


975-978 








983- 


984 988-990 


992- 


993 


996-997 








1000 


1002 1004-1013 


1016 


-1020 








1022 


-1025 1028 1031 


1033 


-1034 








1038 


-1046 1050 1055-


1056 


1059- 








1060 


1062-1064 1067- 


1070 


1072- 








1074 


1076 1078 1082 


1086 


-1087 








1089 


-1090 1093-1097 


1099 


-1103 








1107 


1109-1113 1116- 


1117 


1124- 








1126 


1128-1131 1134- 


1135 


1138 








1140 


1144-1145 1148-


1149 


1153 








1157 


1160 1163 1171 


1183- 


-1184 








1198-1199 1202 1205-1207 


1211 








1216- 


-1217 1219 1221 1225 


1229 








1232- 


-1235 1238-1241 1243- 


-1244








1246 


1250 1253 1257-1258 


1261 
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Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1265-1266 1268 
1277 1280-1283 
1290 1293 1295 
1317-1320 1324 
1330 1334-1335 
1345-1347 1350 
1367 1369 1374 
1400 1406 1408 
1424-1426 1428 
1440-1442 1448 
1468 1472 1474 
1491-1493 1501 
1511 1516 1520 
1531 1536-1537
1547 1549 1552 
1561-1565 1568 
1579 1581-1583 
1592 1597 1605 
1615 1618-1621 
1631 1634 1636 
1650 1652-1659 
1669 1671 1675 
1696-1698 1703 
1719 1722-1723 
1736 1739-1741 
1755 1760-1761 
1771-1773 1776 
286 686 1297 1 
1411-1412 1754 



1270-1271 1274- 
1285-1286 1288- 
1298 1308 1312 
-1325 1327 1329- 
1338 1342-1343 
1355-1356 1359 
1376 1379 1398 
1414 1417 1419 
1431 1434-1438 
1450 1462-1466 
1478 1487-1488 
-1504 1506 1509 
-1521 1526 1529
1539-1540 1546- 
1555 1557-1559 
1571 1575 1578- 
1587-1588 1590 
-1606 1611 1613 
1624-1628 1630- 
1638 1641 1643-
1664 1666-1667 
-1681 1683-1688 
1711 1715-1716 
1726 1731-1733 
1743-1744 1749 
1765 1767-1768 
1779 1783-1786 



Genomic clones 
from the short 
arm of 
chromosome 8 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



303-1304 1352 



esophagus 



BioChain



ESO002



131-132 261 289 380 503 860 892 
1000 1007 1397 



62-63 89 112 126 194 322 336-338 
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1S86 1731- 
1732 1746 1765 



fetal brain



Clontech



FBR001 



fetal brain 



Clontech 



FBR004 



68-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593 



fetal brain 



Clontech 



FBR006



5-9 25 43 60 62 
80 87 92 101 103 
149 152-153 157 
207-208 210 212- 
238 251-253 266 
301-302 307 310 
330 333-334 336- 
357 370 373 377 
391-392 397 399 
411 417 421 424 
437 440-443 454 
476 483 488-489 
513 516 519-520 
544 547 550 561 
590-591 595 597 
623 628-629 631 
657-658 660 665 
689 691-694 696- 
710 716 720 728 
744 757-760 763 
806-807 810 817 
858 861 864 871- 
894-895 898 904 
936 938 945 950 
959 961 963 967 



63 65-66 70 
108 114 13 
168 171-172 
213 221-226 
272 279-281 
317-318 321 
338 346-347 
379-380 382 
402 406-408 
426-427 430 
460 464 467 
495 497 508 
524 530 537 
567 572-574 
604 607-609 
634 638-640 
669 674-675 
697 699 701 
732 734 736 
775-778 780 
818 826 839 
872 884 890 
915 921-923 
952 955-956 
969-971 990 



72 

6 139 
175 
237- 
295 
324 
352 
384 
410- 
436- 
473 
510- 
540 
582 
615 
655 
679 
706 
742- 
799 
843 

-891 
935- 
958- 
992 
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TLSsue Origin 



RNA Source 



Hyseq 
Library Name



SEQ ID NOS: 



999 1 

1016 

1035 

1065 

1114 

1151 

1172 

1190- 

1226- 

1253- 

1270- 

1314 

1339 

1371 

1386 

1425- 

1440- 

1502- 

1519 

1559 

1611- 

1640 

1593 

1718 

1730 

1742 

1767 

1786 



001 1005-1006 1 
1022 1024 1029- 
1042 1047-1048 
1067 1070 1082 
1115 1119 1131 
1153-1156 1160 
1173 1178 1184
1200 1211 1216 
1227 1229 1231 
1255 1258 1260 
1273 1281 1287 
1317-1320 1326 
1341 1344 1350 
1373 1376 1379 
1392 1396-1398 
1426 1428-1429 
1441 1448 1466 
1503 1507 1511 
1536 1544 1549 
1573 1589-1590 
1614 1619 1621 
1651 1657-1658 
1696 1703-1704 
1720 1722 1724 
1733 1735-1736 
1745 1755 1759 
1771-1772 1777 



008 1013 
1030 1032 
1052 1056 
1089 1109 
1143-1149 
1163 1167 
1186 1188 
1222-1223 
1236 1245 
1262 1266 
1308-1309 
1334-1335 
1356 1369- 
1381-1382 
1419 1423 
1432 1437 
1470 1482 
1513 1516 
1550 1557- 
1598 1608 
1625-1626 
1676-1679 
1713-1714 
1726 1728 
1738-1739 
1761 1765 
1779-1780 



235-236 520 864 1068 1188 1587 



fetal brain 



Clontech 



FBRs03 



fetal brain 



Invitrogen 



FBT002



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280-
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854-855 857-858
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 1120'1128
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757 
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



fetal heart 



Invitrogen 



FHR001



105 124 180 289 864 1036 1148 
1229 1614 1616 1762 1785 



fetal kidney 



Clontech 



FKD001



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 
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Tissue Origin | RNA Source 



fetal kidney 
fetal kidney 



Hyseq 
Library Name 



SEQ ID NOS: 



258 277 280-281 307 310 314 330 
371 387 392 395 403 422-423 431 
436 443 455 469 500 519 522 542 
563 572-573 585 600 619 623 650 
654 657-658 660 679 719 731 780 
798 821 833 844 854-855 857 864 
868 878 911 929 958 960 969 990 
992 1007 1045 1087 1103 1129
1139 1285 1312 1331 1355 1369 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601
1618 1631 1651 1654-1655 1669 
1678-1679 1691-1692 1733 1785 



Clontech



FKD002



Invitrogen 



352 384 426-427 440 583 602 1060 
1131 1324-1325 1636 



FKD007 



fetal lung 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



Clontech



FLG001 



fetal lung 



35-36 94 323 371 398 426-427 445
473 549 560 604 616-617 626 631
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178 
1132 1200 1206 1309 1311 1345 
1429 1493 1567 1576 1620 1686 



Invitrogen 



FLG003 



fetal lung 



fetal liver - 
spleen 



Clontech 



Columbia 
University 



9 15-16 29 41 47 68-69 83 88-89
102 124 137 152-153 165 196 224 
229 231 249 254 256 267 291-292 
300 325 333 344-345 352 373 376 
379 384 408 425-427 430 432 467- 
468 475 483 488 493 516 531 535 
545 547 549 564 582 602 623 644 
660 662-664 670 673 725-726 728
761 766-767 774 805 830 852-853 
864 875 921 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027 
1090 1097 1170 1185 1200 1215- 
1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1381 1413- 
1414 1431 1438 1449 1491 1512 
1536 1547 1557-1560 1567 1590 
1601 1636 1544 1653-1655 1652 
1667 1671 1675 1680-1681 1706 
1739 1760-1761 1769 



FLG004 



103 276 334 
1614 1658



465-466 737 843 1131 



FLS001 



3-11 13 15- 
51 54 56-58 
77-80 82-83 
110 112 116 
135-139 141 
157 163-165 
180 186 188 
200 202-206 
233-236 240 
255-256 258 
274 276-278 
293 295 299- 
311 314 316 
332 342 344- 
358 360 362 
386-387 390 
406 408 410- 
437 439-442 
456 459 461- 
487-488 490- 
506 509-513
529 531 534 
553-554 561- 
576 579 581 



21 25 30 
60-66 6 
85 87 8 
-124 126 
144 147 
167-172 
-190 193 
210-214 
-244 246 
261-265 
280-281 
-301 304
318 320 
-345 350 
370-374 
392-393 
412 415 
444-445 
470 472- 
491 493 
515-520 
535-540 
562 564 
583 585- 



-39 41-48 50- 
8-69 72 75 
9 92-103 105- 

127 130 133 
-149 152-153 
174 176-178 
-194 196 198- 
219 221-231 
-247 250-251 
268-269 272 
284-286 288 
306-307 309 
-321 326 329- 
352-353 356- 
376 378-384 
400-401 403 
417 419 422- 
448 452-454 
479 481-483 
500-501 503- 
522-524 526- 
542 547-549 
567-568 571-
597 599-605 
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Tissue Origin^ 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



607 610-613 615-621 623-624 626 
628-634 636-640 644 647-650 655- 
660 665 669-670 672 674-675 678 
681-682 684 690-695 697 702 708-
710 713-714 716-713 725-728 730- 
731 734 736 738 740-741 743-746 
748 750-751 759-766 768 772 774-
777 779 783-788 793 796 798 800- 
805 808 810-812 814 818-819 821- 
824 826-832 834-837 843-847 849- 
867 869-876 878-883 887 889-895 
897-898 902 904-914 916 919 921- 
928 930-937 939 945-950 953-958
960-961 963-965 967 969 971 974- 
978 980-983 986 988-990 992-993 
995-997 1000-1002 1004-1006 1012 
1014 1016-1019 1025-1026 1028- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 1058- 
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1082 1085-1087 
1089-1090 1097 1099-1103 1107- 
1113 1115-1119 1121-1123 1125 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 1206 1208-
1211 1214 1216 1218 1221-1222 
1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 1299-
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446 
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480 
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1516-
1519 1524-1526 1529 1532 1536-
1541 1546-1547 1549-1550 1552- 
1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588 
1591-1592 1594-1595 1597-1598
1600-1604 1611-1612 1614-1615 
1617-1618 1620-1622 1624-1625 
1627-1628 1630-1632 1634-1639 
1645-1651 1653-1662 1664 1667- 
1669 1671 1673-1674 1676-1688 
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740- 
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773 
1780 1783-1786 



fetal liver- 
spleen



Columbia 
University 



FLS002



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 54 
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138 
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203 






WO 01/53312 



PCT/US00/34263 



Tissue Origin | RNA Source



Hyseq 
Library Name 



206 212-215 219-221 223 225-229 
231-232 240-244 246-247 250-251 
258-259 262 264 268-269 272 275 
277 280-281 284 286 288 290-292 
295 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334 
342 348-349 352-353 356 359 368 
371 374 376-379 381-384 386-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 429-430 433- 
436 438 440 443 445 448 451-452 
454-455 460-463 465-467 469 471- 
473 475-476 478-479 481-483 487 
490-491 493-494 497 500-501 503- 
505 509-513 515-517 519-520 524 
526-531 534 537-542 544 547 552- 
554 556 558 561-562 564-567 571- 
577 583-587 590-591 593 595 597 
601 604-606 608-613 616-617 619- 
624 626-632 634 637-642 644 647 
649-652 654-659 662-665 669-672 
674-675 681-682 685 688 690 696 
698 700-703 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 759 
763-766 768 770 773-777 780 782 
784 786 791 795-798 801-802 805 
808 811-812 818 823-824 826-827 
832 834-837 839 843 846 848-856
858-861 865 867 869 871 873-874
876 878 881-882 887 889 892 894- 
898 901-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 958 
961 965-967 971 973-975 977-979 
981 984-985 990 992-993 995-997 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044- 
1045 1049 1053 1055-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089
1093 1097 1099-1103 1105-1107 
1109-1114 1123 1125-1127 1132- 
1134 1140 1143-1145 1148-1150 
1156 1158 1160 1163 1172-1173
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1208 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1278- 
1279 1284-1286 1288-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400 
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496 
1498 1500-1504 1506 1508-1509 
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1556 1564 1567- 
1569 1580 1587-1588 1591-1592 
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Tissue Origin 



RNA Source 



Hyseq
Library Name



SEQ ID NOS: 



1600- 
1630- 
1649 
1664 
1683- 
1699 
1717 
1733 
1752 
1757
1783- 



"1611- 
1635- 
1654- 

-1669 
1686- 
1707 
1722 
1740 
1760- 
1772- 



1597- 
1618 
1641 
1661- 
1676 
1691 
1713 
1727 
1744 
1763 
1776 



-1598
-1628

1646- 
-1662
-1675
-1692
-1714

1730- 

1748- 
-1764

1779 



1601 
1631 
1652 
1667 
1684 
1702 
1719 
1738 
1758 
1769 
-1786



1612 

1638 

1659 

1674 

1688 

1711 

1726- 

1743- 

1761 

1773 



fetal liver- 
spleen 



Columbia 
University 



FLS003 



103 300 318 321 352 372 379 381 

384 392-393 403 422 424 429 434- 

435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 

1357 1369 1378 1418 1424 1622 

1646 1649 1680-1681 1689-1690
1717 1743-1744 1769 



fetal liver 



Invitrogen 



FLV001 



230 
272 



15-16 26 34 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200 
204-206 210-211 220 225-226 
235-236 239 247 259 261 267 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434-
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872- 
873 875 881 889 894-895 909 911
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043
1070 1086-1087 1089-1090 1118-
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362-
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 



Clontech



FLV002 
FLV004 



676 998 1719 

93 133 214 301 355 3
581 601 679 837 847 
1236 1270 1313 1324- 
1355 1367 1425-1426 
1733 1760-1761 



fetal liver 



Clontech 



74 379 555 
859 1123 
1325 1327 
1536 1690 



"fetal muscle 



Invitrogen 



FMS001 



26 37-39 50-51 58 84 
113 128 131-132 139 
194 198 201 206 211 
261 276 282 286 302 
376 379 383 398 412 
436 448 452 462-463
519 529 561 569-570 
607 623 626 635 647 
725-726 730 733 761 
826 837 860 874 913 
970 980 986 988-990 
1001 1007 1014 1027 
1045 1060 1064 1070



86 89 98 
155 172 186 
230-231 256 
325 359 361 
413 419 430 
473 477 503 
590-591 597 
660 672 715 
775-777 788 
915 921 935 
992 1000- 
1035-1036 
1083 1097 






Tissue Origin | RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



1099- 
1173 

i2ee 

1324- 

1383- 

1433 

1557- 

1632 

1712 

1766 



1102 
1198 
1270 
1325 
1384 
1505 
1559
1644 
1725- 



1116- 
1208 
1277 
1329 
1399- 
1514 
1562 
1650 
1726 



1117 

1229 

1298

1336- 

1400 

1542 

1589 

1652 

1743- 



1121 
1240 
1317 
1337 
1403 
1551 
1599
1671 
1744 



1164 
1258
-1320 
1369 
1409 
1554 
1620 
1675 
1754 



fetal muscle 



Invitrogen 



FMS002 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1458 1545 1599
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



fetal skin 



Invitrogen 



FSK001 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87
97-98 105 107-108 113 118-119 
123 133 135-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 197-198 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 339 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384 
388 394 404-405 408-409 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 526 531 537-540 547 549 560- 
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 657-658 660 662-665 667 669 
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 788- 
789 798 809 811 814 816-817 822
824-826 831 842 857 859 861 863- 
864 881 894-895 908 910-911 916
918 922-923 928 932-933 935 937 
946 948-949 953 960-961 966-967 
970 975 977 986 990 992-993 999- 
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366 
1369 1371 1373 1376 137S 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535- 
1536 1547 1549 1557-1559 1588 
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624- 
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Tissue origin I RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1626 
1644 
1665 
1702- 
1724 
1742 
1765
1785



1632 
1646 
1668 
1703 
1727 
1747 
1772 



1634 

1654- 

1675 

1709- 

1731- 

1749 

1776- 



1636 
1657 
1685 
1710 
1732 
1755 
1777 



1641 
1660- 
1687- 
1716 
1737- 
1760- 
1779- 



1643- 

1662 

1689 

1719 

1740 

1761 

1780 



fetal skin 



FSK002 



Invitrogen 



13 286 302 307 313 321 330 335 
339 341 354 370 372 385 400 402 
408 414 426-427 433 436 450 454
515 544 585 598 767 810 845 939 
1076 1109 1155 1317-1320 1326 
1333-1335 1343 1347 1350 1369- 
1371 1377-1378 1391 1397 1422
1466 1647 1656 1678-1679 1687- 
1688 1693 1718 1721 1725 1731- 
1732 1739 1755 



110 137 211 353 589 927 1108 
1639 1771 



fetal spleen 



BioChain 



FSP001 



umbilical cord



BioChain 



FUC001 



4-8 10 12 14 17 33-36 44-46 57 
64 68-69 75 82 85 101 104 113- 
114 116 119 122-124 133 137 153- 
154 157 161 163 166-167 175 181- 
184 186 192 197-198 200-202 212- 
215 230 234 246-247 251 256 263 
267 271-272 280-281 284 295 301 
314 317 321 326 333-335 345 351 
356 368 371-373 379-380 386 390 
392 394 406 408-410 412 414 416 
420 424 427 430-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 555 561 574-577 588- 
591 593 606 615 620-621 632 637 
645-647 650 659-660 662-664 667- 
668 674-675 684 687 696 698 701 
703-705 709 711 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858 
861 864 875 879 888 894-895 897- 
900 903 906-907 911-912 925 930- 
933 936 940 948 953 960 966 977 
984 990 992 998 1000-1001 1005- 
1007 1016 1023 1025 1037 1046-
1047 1059 1061-1063 1073 1076- 
1077 1089 1094-1097 1112-1113 
1115 1134 1144-1148 1151 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600- 
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647-1648 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



fetal brain 



GIBCO 



HFB001 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66 
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SEQ 


ID NOS: 












72 75 77 80 


82 


85 90-91 


94 100- 








102 107 


110 


112 


-116 


118- 


119 


122- 








123 126 


128 


134 


136 


-140 


147-148 








153-155 


157 


161 


165 


169- 


172 


175 








181 186 


188- 


189 


197 


-198 


204- 


-206








208 210 


215 


222 


-223 


225- 


226 


230








235-238 


240- 


-241 


247 


253 


256- 


-258 








260-262 


267- 


-269 


276 


279- 


281 


284 








286 289 


298 


300 


-302 


307 


310 


318 








321-323 


325 


330 


-331 


339 


341 


346- 








349 352 


354 


356 


-359 


362 


364- 


365 








371-372 


377 


379 


-380 


382


384 


387 








390 400 


408 


414 


-416 


419 


424 


431 








434-435 


438 


441 


-443 


449 


451 


453- 








455 457- 


-463 


470 


472 


-473 


475 


477- 








478 482- 


-483 


486 


-488 


490- 


491 


493 








496 499- 


-500 


502 


-504 


506- 


507 


509- 








512 515 


519- 


-520


522 


525- 


526 


529- 








530 537- 


-540 


543 


-544 


546- 


547 


566- 








567 569-570 


572 


-582 


585 


588 


590- 








591 593 


595 


599 


601 


604


606- 


-609








611-612 


614- 


620 


622 


-624 


630 


632 








636 643 


645- 


-647 


650 


-652 


654 


659 








661 665 


667- 


668 


670 


-672 


676 


678 








681 687 


689 


692 


-694 


697 


699 


710 








714 717 


721 


727 


729 


-732 


734 


736 








738 743-746 


750 


-751 


759 


763 


766 








770 772 


775- 


-777 


784


789


791 


796 








799 802- 


-805 


810 


-811 


814 


819- 


-821 








824 826 


830 


834 


-837 


839- 


850 


854- 








856 858- 


-860 


862 


864 


869 


871 


876- 








877 879 


883 


886 


-887 


890- 


891 


893- 








895 898-


-901 


905 


908 


-910 


912- 


-916 








919 922-923 


925 


927 


930- 


933 


935- 








938 948 


952- 


-960 


963 


-964 


967 


969- 








972 975 


978- 


979 


981 


983 


986- 


-987








990 992 


995 


997 


999-1002 


1005- 








1006 1011-1013


1016 


1018 


-1019 








1023 1026 1029- 


1031 


1033 


-1035 








1038 1041 1047 


1050 


1053 


1057 








1059 1064 1068 


1070 


1072 


-1073 








1078-1079 1081- 


1082 


1086 


1089 








1094 1097 1103 


1107 


-1109 


1113- 








1115 1121-1122 


1127 


1134 


-1135 








1138 1140 1143 


1148- 


-1151 


1153 








1156-1157 1159 


1167 


1170 


1175








1193-1194 1200 


1202 


1207 


-1209 








1211 1216 1219-


1220 


1226 


-1227 








1229 1232-1234 


1240-1241 


1243 








1246 1249-1251 


1253 


-1254 


*1258 








1267-1268 1271 


1276 


1279 


1282 








1285-1289 1293- 


1294 


1305 


1307- 








1308 1312 1316 


1320 


1327 


1338-








1339 1341-1344


1346 


1349 


1355- 








1357 1359 1365- 


1366 


1369 


-1370 








1373-1375 1379


1386 


1389 


1394 








1398 1409 1413- 


1414 


1416 


-1417 








1420-1421 1425- 


1427 


1430 


1433 








1437 1439 1442 


1445-1452 


1454- 








1457 1459 1463- 


1464 


1468 


1470 








1474 1477-1479 


1489 


1492 


1494 








1497-1498 1501- 


1503 


1507 


1509 








1511-1513 1517 


1520- 


-1521 


1524- 








1526 1531-1533 


1535 


1537


-1538 








1547 1554 1556- 


1559 


1564 


-1567 








1571 1584 1587 


1589 


1594 


1599- 








1601 1611-1612 


1614- 


1616 


1619- 








1620 1625-1628 


1630- 


-1631 


1634








1637-1638 1640- 


1643 


1645 


1648- 



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Hyseq 
Library Name 



SEQ ID NOS: 



1649 
1664 
1679 
1704 
1720 
1737 
1755 
1779 



issr 

-1665
1683- 

-1705
1724 

-1738
1757 
1785 



1653- 
1667 
1634 
1709 
1727- 
1743- 
1760- 



1655 

1669 

1686 

1713- 

1728 

1744 

1761 



1657-1658
1673 1678- 
1693 1701 
1714 1717- 
1731-1733 
1752 1754- 
1765 1772 



macrophage 
infant brain 



Invitrogen 



Columbia 
University 



HMP001 
IB2002 



5-8 110 204-205 503 634 678 859 
878 933 988-989 1379 1448 1504



10 12-13 15-18 22-23 25 29 34 

37-39 43 47 50-51 54-56 58 60-63 
65-66 68-69 72-74 80 82-83 86 
88-92 97 100 102-104 106-108 110 
112-113 115-116 118 123 128 130 
134-136 138-139 143 147-149 151- 
152 154-155 163 165-167 169 172- 
175 181-184 186 193-196 198 201 
203-205 209-210 214-215 222 224- 
226 231-232 235-236 239 246-247 
252 257 260 268-269 272 276-277 
279-281 286 288 291-292 295 298 
300-301 304 307 310 313 321-323 
330-331 333-334 339 346-347 349 
352 356-357 362 371-372 377 379- 
380 383-384 392 397 401 406 408 
411 413-414 416 418-419 422 428
430-431 434-435 438 443 449 453- 
454 461 464-466 469-470 472-473 
475-476 478 482-483 487 490 492
494 497 503 507-508 510-513 516 
519-520 524-526 530-534 536-540 
547 550-551 561 563-564 566-567 
572-576 579 581-582 584-S07 590- 
591 593 595-597 607-609 611-613 
616-617 620 622-624 627 631 637 
641 645-647 650-655 657-658 660-
665 667-675 689 691 695 697 699 
703 707 713-715 717 721 728-731 
733-736 739 743 745 751 755 759 
763 769-770 772 778 780-781 785 
788-789 793-794 799 803 808 811
814 825-826 830 834-836 840-843 
845 848-850 854-855 860 862 864- 
865 870 872 875-876 878 886 888 
890-891 894-896 898 903-904 916- 
917 919 922-925 927-928 930-932
934-936 938 941 945-946 948-950 
953-954 959-962 966-969 977 979 
981 986-990 992 997 999-1000 
1004-1006 1014 1016 1018-1019 
1024-1025 1033 1036 1047 1051- 
1052 1054-1055 1057-1059 1063- 
1064 1068-1070 1073 1081-1082
1085 1089 1108-1113 1118-1120 
1123-1124 1130 1132-1138 1140 
1149 1151 1153-1154 1163-1170 
1172 1174-1175 1183-1184 1188
1190 1193-1194 1196-1197 1199 
1204 1208-1209 1211 1218-1222 
1226-1227 1229 1231 1234 1241 
1247 1249 1251 1256 1258 1261- 
1262 1269 1274 1279 1281 1283 
1285 1287-1289 1294-1295 1305 
1307 1313-1314 1316-1320 1329 
1332 1341-1342 1345 1349 1356 
1362-1363 1365-1366 1368-1370 
1374 1381 1383-1384 1388 1400 
1403 1406-1407 1413 1417 1420 
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PCT/LS00/34263 



Tissue Origin



RNA Source 



infant brain 



Hyseq 
Library Name 



Columbia 
University 



infant brain



Columbia 
University 



IB2003 



SEQ ID NOS: 



1423 1429-1431 1435-1436 1439- 
1441 1443 1447-1449 1451-1452 
1454-1455 1457 1459 1463-1465
1468 1470-1471 1475 1479 1482- 
1483 1485 1493-1494 1496 1430- 
1499 1502-1503 1505-1507 1509 
1522-1523 1525 1528 1532-1533
1542 1546-1547 1549-1550 1554- 
1555 1563 1565-1567 1569 1575 
1580 1583-1586 1588 1590 1592-
1593 1595 1598 1600-1601 1608- 
1610 1612 1614-1616 1619 1621 
1624 1626-1627 1630-1633 1637 
1639-1640 1642 1644 1647 1652 
1654-1655 1658-1659 1664-1665 
1672-1673 1676-1681 1685-1688 
1693-1695 1701-1702 1704 1708 
1717-1720 1723-1724 1726-1728 
1733 1735-1741 1743-1744 1752 
1755-1758 1762 1765 1771 1774 
1777-1778 1786 



17-18 20-23 29 34 43 60 68-69

78-80 88 100-101 107 110 112 128 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 250
276-281 286 290-292 295 300-301
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 392 
384 403 408-409 414-415 453-455 
472 476 478-479 490 503 507 516
520 530 534 536-540 551 563 572-
576 585 587 590-591 593 595-596 
601 606 612 616-617 620 622-624 
650 652-653 661 665 670-671 674- 
675 678 60S 725 717 727-728 730 
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109
1114-1115 1120 1123 1127 1144- 
1145 1149 1151-1153 1160 1167
1170 1174 1193-1194 1196 1199
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1252 1258 1284 
1288-1289 1305 1314 1327 1333
1344 1347 1350 1356-1357 1365
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1499 
1503 1507 1509 153S 1546 1557
1559 1567 1572 1587 1595 1598 
1610-1612 1615 1631 1639 1644 
1647 1657-1658 1673 1678-1681 
1683-1684 1701-1702 1708-1709
1713-1714 1719 1757 1760-1762 
1765 1771 1778 



IBM002

101 113 139 152 260 279 290-292

374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 
1779 



Columbia 
University 



IBS001 



10 12 119 175 279-281 321 334 
371 446 551 563 623 652 667 669
671-672 819 949 966 1113 1130
1151 1188 1193-1194 1196 1229
1258 1265 1271 1287 1317-1319
1324-1325 1342 1423 1440-1441
1448 1471 1482 1525 1532 1546
1562 1569 1588 1591 1610 1618
1647 1649 1658










Stratagene


LFB001 


5-9 17 20-21 25 68-69 82 94 105
153 157 197-198 203 207-208 212-
213 223 262 266 233 302 321 326
333 356 370 427 430 436 446 462
472 493 498 503 516 519 527 535
537-540 542-544 562 565 567 586
599-600 607 615 630 647 662-664
692-694 712 719 745 748 775-777
794-796 810 837 843-847 849 854-
856 869 876 903 934 953 955-956
964 975-976 934 1000 1005-1007
1024-1025 1033 1039 1053 1064
1070 1072 1082 1112-1113 1134
1136-1138 1140 119S 1223 1232-
1233 1246 1279 1285 1295 1311
1320 1334-1335 1343 1427-1428
1446 1478 1482 1493 1504 1537
1552 1555 1567 1575 1582 1598
1620 1625 1632 1638 1645 1654-
1655 1662 1680-1681 1684 1686
1690 1696 1702 1711 1733 1741
1760-1761 1778 1785








Invitrogen 


LGT002 


5-10 18 20-21 29 33-36 40 43 52
54-55 61 65-66 68-70 73-75 80 85
88-89 93-94 100 103 106-108 112-
113 115-116 118-119 123-124 126
130-132 135-137 139-141 143-144
147-148 151-153 155-156 159 161
164 169 171 179-180 185 190 192
194 196-199 203-208 210 212-214
216-217 219 222 233 240-241 244
246 251-252 255-256 261-262 266
272 276-277 279-281 284 286 288
290 295 298 301-302 309-312 317
321 329 332 341-342 344-345 348
352 358-360 363 368 370-371 376
380-381 384 389-390 398 400 409
414 423 426-427 430 432-436 443-
444 450-451 454 462 468 472-477
480-483 487-488 490-491 493 496-
498 500 503-506 509-512 515-516
519 521-523 526 530 534 541 544
547 554 557 564 566-567 572-576
585-586 588-589 595-596 601 607
611-612 615 619 621 623 626 630
632-633 644 647 649 651 655-656
660 662-665 667 669 672 683-684
696 700 705 710 713 716 718-719
722-723 728 734-739 743 750 752
763 765-765 773-778 784-785 787-
789 791 800 802-803 809-812 814
824 826 828-829 832 838-839 841-
845 849-850 852-855 857-851 864
866 874 878-880 882 887 890-891
897-898 902 904 906-907 910 916
918-920 922 924-925 927 930-932
934-935 937 947 950 953 955-956
961 963 966-967 969 971 977-979
981 984 986-987 990 992-993 995
997 999-1001 1005-1007 1009
1012-1013 1018 1020 1022-1024
1026 1029-1030 1033 1038 1041



lung, 
fibroblast 



lung tumor

lymphocytes 



ATCC



LPC001 



1045 1047-1050 1052 1054 1055
1059 1063-1064 1067 1071 1073-
1074 1078 1085 1087 1089 1095-
1097 1104 1106 1107 1109 1112
1116-1117 1119 1126 1134-1135
1139 1141-1142 1144-1145 1148
1152-1153 1156-1158 1167 1170
1172 1178 1195-1196 1198-1200
1202 1204 1203 1214 1216 1219
1222 1227 1234 1241 1247 1252
1257-1258 1265 1267-1270 1276
1278 1280-1281 1283 1285 1288-
1289 1295 1300 1305 1308 1312
1317-1321 1329 1338-1339 1341
1344-1346 1349-1351 1353 1355
1357 1365-1366 1369 1378 1379
1383-1385 1394 1397 1400 1402-
1403 1408 1417 1419 1423-1426
1431 1433-1436 1438 1444 1446-
1448 1454-1455 1460 1466 1468
1470 1474 1480-1481 1483 1486-
1488 1490-1491 1494-1496 1506
1508-1509 1511-1512 1515-1516
1519 1523-1524 1528-1529 1536-
1540 1546 1549-1550 1555 1560-
1561 1565 1567 1569 1575 1588
1591 1593-1594 1596-1598 1600-
1602 1608 1614-1616 1618 1620
1624-1625 1627-1632 1636 1639
1644-1645 1647-1649 1652-1653
1656-1662 1664 1666-1667 1670-
1671 1673-1675 1678-1679 1683
1685-1688 1690-1692 1696-1699
1705 1709 1716-1717 1722 1727
1730 1735 1739 1741 1743-1744
1748-1749 1753 1760-1762 1765
1767 1770-1771 1773 1775 1776
1778-1779 1786



4 11-12 18 24-25 30-31 48 50-51
56-57 68-69 80 92 98 103 105 110 
126 137 152-153 157 165 172 188- 
189 197 203 210 217-218 222-223 
225-226 229 231 247 251 256 264 
272 280-281 284 300-301 321 325- 
326 339 348 352 357 371 382 384 
390 400 404 412 414 421 423 426- 
427 430-431 445 447-448 451 454- 
455 475 503 516 525-527 530 537- 
540 549 556-560 563 574 577 589 
602 613 615-617 621 623 628-630 
636-637 647 649 657-659 690 697 
717 723 755 764 775-777 780 786 
789-790 793 800 802 822 838 849
866 869 876 881-883 892 898 906-
907 911 921-923 928 975 990 992 
996 1001 1004-1007 1033 1050 
1054 1078 1107 1135 1140-1141 
1143 1148 1158 1163 1177 1199 
1205 1216 1226 1231 1236 1241 
1244 1250 1258 1260 1265 1269- 
1271 1290-1293 1308 1312 1317 
1319-1320 1339 1345-1346 1348 
1350-1351 1357 1367 1369 1379 
1381 1383-1384 1386-1387 1389 
1394 1397 1405 1423 1425-1428 
1431 1437 1446 1448 1461 1466 
1470 1472 1474 1482 1492 1506 
1528 1537 1546 1549 1591 1598 
1600 1603-1604 1606 1627 1636
1638 1647-1649 1651 1659-1659
1664 1676-1677 1680-1681 1687-
1688 1699 1711 1715-1716 1726
1728 1737 1740 1746 1748 1752
1756 1758 1777 1773






leukocyte 


GIBCO 


LUC001


3-4 10-11 13 15-18 20-21 24-25
30-31 35-36 40 43-45 48 50-51
54-58 60-63 68-69 75 79-80 82-83
85 88-91 93-96 98 100 103-104
107-108 112 116 119 123 125-128
134-140 142 147-149 151 153 155
157 162-163 167 169-172 174 177-
179 186 190 192-199 203-207 210
212-215 217-219 222-223 229 235-
236 247 251 255-258 260 262 272
274-277 280-281 285-286 297-301
307-310 313-314 316-317 321 325-
330 333-334 340-342 348-349 352
354-358 370-371 380-385 387-388
400 405 408-410 412 414-416 421-
425 430-431 434-435 437 439 441-
442 445-451 453-454 456 459 461-
464 468-472 474-479 481 483-485
487-491 496 499-501 503-504 509-
513 516-519 522 526-527 529-531
534 536-540 542 547-549 553-559
566-567 571 574-577 579 582 584-
586 589 593 595-597 601-602 604
606-607 611-613 615-621 623 627-
629 633 636-637 642 644-650 655
659-660 662-665 667 669 674-675
678 682-684 692-696 698 700 706
708 710 716-720 725-726 729-736
738-739 743-746 749 751 753 756
759 765-766 768 770-773 780 784-
786 788-790 793 796 793 800 802-
803 810-811 814 817 819 826 828-
830 832 834-836 838 843 845-860
863-864 866-871 877-879 881-892
894-896 898 902 904-914 916 919-
925 927 930-932 935-936 941-942
945 948-949 953 955-956 958 960-
962 964 967 970-971 973 975 977
985-990 992-993 995-996 999-1002
1004-1009 1011 1014 1017-1019
1022-1023 1025 1027 1029-1031
1033-1036 1038 1041 1043 1047
1050 1053-1054 1058-1059 1061-
1062 1064 1068 1070 1072 1078
1085-1086 1089-1091 1093 1097
1106-1107 1110-1113 1115-1117
1122-1123 1125 1129 1132-1133
1135-1137 1140-1145 1152 1158
1163 1168 1170-1174 1176-1178
1180 1182-1183 1186 1195 1198-
1200 1202 1205-1206 1211 1216
1219-1221 1223-1227 1230-1236
1238-1242 1247 1252 1254 1256
1258 1261-1262 1264-1265 1269-
1270 1272-1275 1277 1280-1284
1287-1293 1299-1300 1306 1308
1312-1313 1317-1320 1322 1324-
1330 1333-1335 1339 1341 1343-
1347 1349 1353-1357 1359-1361
1365-1367 1369-1370 1373-1374
1377 1379-1381 1386-1387 1394
1400 1403 1409 1419 1423 1423-
1428 1430-1431 1433-1434 1437-
1438 1440-1442 1446-1448 1450
1453 1458-1459 1463-1464 1468
1470-1471 1474 1477-1478 1482-
1488 1490-1493 1496-1501 1504
1506 1509 1512-1513 1516 1519
1521-1522 1524-1525 1527-1528
1531 1534 1538 1541 1545-1547
1549-1550 1553 1555-1556 1560
1565 1567 1575 1580 1589 1591
1594 1596 1598 1600-1602 1606-
1608 1611 1614 1620-1621 1624
1626-1629 1631-1632 1636 1638-
1639 1641 1644-1645 1648-1650
1653-1655 1658-1660 1662 1669-
1670 1675-1679 1684-1688 1690-
1692 1696 1700 1702 1707-1709
1711 1716-1717 1720 1723 1725-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1752 1755 1760-
1762 1765 1769 1771-1772 1781-
1784 1786

leukocyte
Clontech
LUC003

melanoma from cell line ATCC #CRL 1424
Clontech
MEL004

mammary gland
Invitrogen
MMG001


4 35-36 44-45 61 68-69 75 82 ma



119 139 154 179 197 244 280-281 
324 372 404 430-431 455 461 476- 
477 481 503 537-540 554 575-576 
581 589 608-609 621-622 624 630 
632 647 662-664 669 675 698 764 
773 775-777 802 848 851 856-857 
879 905-907 915 949 952 990 992 
1002 1113 1119 1170 1183 1216 
1236-1237 1241 1275 1346 1353 
1357 1359 1377 1506 1515 1534 
1553 1591 1600 1613-1614 1621 
1628 1670 1676-1677 1691-1692 

1699 1733 1738 1772 

25 35-36 43 80 104 126 128 150



163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-338 
34S 351 372 380-381 383 387 412 
415-416 430 445 448 454 456 467 
481 490 499 503 526 528 546 548 
567 575-576 588 601 613 615 647 
660 665 734-735 737 753 778 787 
790 800 832 845 856 859 869 878 
883 887 905 914 932 934 958 976 
985 990 992 999-1000 1025 1031 
1038 1050 1055 1068 1074 1088 
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278-1280
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731-1732 1750 1760- 
1761 



5-8 10 12 14-18 20-21 24-25 23 
33-39 42-43 52 55-58 60-64 68-69 
71 73-74 79-80 82 89 98 100 103 
106 108 112 123 128 133-137 144- 
146 148 150-152 154 158-159 165- 
166 170-172 174 176 178 181-185 
188-190 194-198 201-206 210 217- 
222 224 227-228 231 233-237 247 
251 253-254 256 261-263 266-267
271 276-277 279-281 284-286 288
290 297 299 301 304 309-312 318
320-321 323-325 327-329 331-332
334 339 341 344 345 348 350 356
359-360 362-363 368 371 376 379-
383 380 390 393 395 397-398 405
408 412 414-415 423 430 434-437
441-444 448 451 455 462-464 474
476 479 482 485-486 488 490 494-
495 498 503 506 509-512 516-517
519-520 522 527 529 534 537-541
547 549 554 557 562 572-574 587
589-591 597 602 607 618 623 628-
629 632 634-640 644 647-648 650-
652 655 657-658 660 665 667 669-
672 674-676 679 682 688 695-696
706-707 710 713 717 720 722-730
732-734 736 738 743 747-748 750
755 759 761 766 770 780 784 786-
789 794 803 806-807 809 814 817-
822 827-829 837 842 854-858 863-
864 866 869-870 872 878 881 889
893-900 904 906-907 911 916 919
921-923 926 935-937 946 948-949
953-954 957 960-961 963 965-966
970 977-978 984-989 993-997
1000-1001 1005 1006 1008 1013-
1014 1016-1017 1023 1025 1027
1032-1033 1036 1039 1043 1045
1055 1057-1058 1063 1068-1075
1077-1078 1085 1087 1089-1091
1095-1102 1107-1108 1112-1119
1121-1123 1131-1133 1136-1137
1139-1142 1144-1145 1148-1149
1153 1159 1167 1170 1172-1173
1183-1185 1190-1192 1196-1199
1207-1208 1212 1216-1218 1222-
1223 1225 1231 1234 1240-1241
1247 1253-1254 1258-1259 1261-
1262 1270-1280 1283 1285-1286
1298 1307 1314 1316-1320 1323-
1325 1330 1334-1335 1342-1345
1349-1352 1354-1355 1359 1369-
1370 1377 1379 1381 1383-1384
1389 1405 1414 1419 1421-1423
1425-1426 1428-1429 1431 1434-
1437 1439 144S-1449 1454 1457
1460-1464 1466 1471 1480-1483
1487 1489-1491 1493 1505 1507
1512 1519 1526-1528 1532 1534
1536 1539 1542 1547 1549-1550
1554 1561-1562 1564 1567 1572
1576-1579 1581-1582 1587-1588
1592 1594 1596-1597 1601-1602
1607-1608 1610 1612-1616 1618
1621-1622 1625-1626 1631 1635-
1636 1641 1643-1644 1647 1650
1652 1654-1655 1657-1658 1660
1662 1664-1666 1669-1671 1673-
1674 1676-1677 1680-1685 1689-
1692 1701 1706 1713-1715 1719-
1720 1723-1728 1730-1732 1738
1740 1742-1744 1746-1747 1749
1751 1753 1760-1762 1765-1768
1771 1774 1776-1777 1779 1783-
1784 1786



induced neuron cells
Stratagene
NTD001

29 35-36 80 116 123 156 163 181
214 230 280-281 284-285 307 321
330 340 358 371 375 377 380 382
422 424 492 497 532-533 542 546
549 566 586 595 6lT 645-647 654
734 775-778 780 732 799 821 826
856 858 875 936 953 985 990 992
1041-1043 1055 1072 1104 1193-
1194 120G 1223 1246 1253 1274
1288-1289 1291 1294 1311 1320
1343 1359 1412 1423 1485 1620
1623 1645 1684 1705 1715 1751



retinoic acid induced neuronal cells
Stratagene
NTR001

5-8 78 268-269 277 383 431 506
623 677 731 999-1000 1139 1425-
1426 1547

neuronal cells
Stratagene
NTU001



29 65-66 80 82 110 119 146 152
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537- 
540 572-574 597 602 607 623 647 
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
102S 1064 1068 1122 1148 1185 
1219 1226 1234 1246 1271 1283
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378 
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



pituitary 
gland 



Clontech 



PIT004 



placenta 



Clontech 



PLA003 



311 314 379 408 419 430 454 10ST" 

1095-1096 1272-1273 1312 1320 

1378 1652 1671 1720 1725 1736 
1741 1755 



prostate 



Clontech 



5-8 124 208 277 370 843 906-907 
1280 1317-1319 1359 1609 1621 
1737 



PRT001 



rectum 



Invitrogen 



REC001 



9 46 57 71 107 147 171 177 197 

201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662- 
664 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045 
1061 1095-1096 1112 1125 1142
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313 
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475 
1478-1479 1482 1489 1513 1517 
1527 1531 1536 1598-1599 1628 
1636 1657 1680-1681 1687-1688

1717 1738 1743-1744 

17-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537-
540 547 567 585 589 602 623 628-
629 632 645-647 651 657-658 669
717-719 721 725-726 738 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948-
949 960 986 996 1020 1023 1033-
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350- 
1351 1355 1369 1373 1375 1425-
1426 1436 1439 1469 1474 1477
1482 1546 1587-1588 1592 1596
1610 1622 1627 1644 1653 1662
1665-1666 1669 1675-1677 1749
1786














salivary gland
Clontech
SAL001

10 55 97 103 110 140 149 152 158
198 217-218 242-243 256 301 308
312 321 333 351 354 360 410 437
448 473 487 494 496 501 535 555
3D 3 570 572-573 590-591 624 636
651 759 762 764 768 771 788 800
809 826 848 865 879 906-907 925
933 963 1016 1020 1025 1040 1046
1055 1066 1103 1150 1172 1181
1234 1281-1282 1288-1289 1298
1315 1320 1333 1336-1337 1346
1359 1373 1379 1424 1447 1449
1474 1482 1492 1494 1498 1511
1523-1524 1537 1554 1596 1626-
1627 1636 1652-1655 1658 1665
1671-1672 1691-1692








salivary gland
Clontech
SALs03

158 326 1423 1463-1464






skin fibroblast
ATCC
SFB001

1320 1400

skin fibroblast
ATCC
SFB002

262 736 1025 1253

skin fibroblast
ATCC
SFB003

709 1119 1350 1631 1653




















small intestine
Clontech
SIN001

25 142 146-147 151 155 198 203
244 260 271 280-281 286 288 298
301-302 308 312 334 340 371 398
408 412 414 416 423 425-427 430
434-435 445 452 454 478 503 516
519 521 523 543 547 549 555 559
563 569-570 585 592 604 611 626
628-629 632 650 659 681 710 714
718 750 764 780 798 829 842 857
859 866 837 892 894-895 901 904
906-907 912 919 935 997-998 1000
1007-1008 1026-1028 1044 1055
1089 1097 1116-1117 1131 1148
1169 1199 1219 1234 1247 1264
1279 1316 1320 1326 1341 1343
1349 1351 1374 1387 1398 1400
1403 1407 1423 1428 1468 1498
1501 1521 1550 1556 1585 1597
1636 1638-1639 1645 1653 1656
1662 1671 1675 1684 1691-1692
1704 1711 1717 1719 1722 1725-
1726 1729 1733-1734 1743-1744
1762 1767 1780 1785








skeletal muscle
Clontech
SKM001

18 20-21 82 84 101 118 134 148
151 153 166 225-226 258 274 277
289 329 361 412 414 424 440 452
459 470 488 503-504 537-540 647
660 673-675 715 773 780 786 830
905 922 950 963 982 990 992 1020
1047 1063 1115-1117 1121 1134
1228 1268 1284 1298 1321 1329
1336-1337 1343 1409 1413-1414
1509 1599 1624 1644 1653 1712


skeletal muscle
Clontech
SKM002

168 1683 1712

skeletal muscle
Clontech
SKMS03

235-236 1409

skeletal muscle
Clontech
SKMS04

235-236




















spinal cord
Clontech
SPC001

4 9 11 17 30-31 35-36 43 46 60
82 85 92 94 108 110 116 139 157
167 198 204-205 210 215 229 256
259 277 280-281 300-302 304 315
317 372 379 387 392 419 426-427
430 433 448 467 473 487 489 506
509 513 519 524 526 537-540 543
547 549 551 559 567 569-570 593
607 616-617 623 625 637 649-650
652 657-658 670-671 673 679 681-
682 709 711 715 719 728-729 734
749-750 753 775-777 781 789 791
809 820 832 834-836 847-849 854-
855 858 861 864 871-872 875 884
898 906-908 917 919 924 934 942
944 970 985 990 992-993 998 1013
1039 1053 1059 1065 1072 1075
1077 1082 1085 1097 1103 1109
1116-1117 1128 1134 1151 1170
1174 1192-1194 1215 1225 1241
1243 1283 1294 1307 1312 1320
1323 1327 1330 1350 1353-1354
1356 1359 1368 1375 1400 1406-
1407 1423 1429 1437 1443 1448
1454 1470 1482 1492 1501 1508
1511 1529 1538 1548-1549 1565
1571 1578 1598 1600 1614 1625
1627 1630 1639 1646 1651-1652
1670 1686 1696 1740 1751 1755
1771

adult spleen
Clontech



SPLc01



117 312 326 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



stomach 



Clontech



STO001 



thalamus 



Clontech



10 15-18 61 68-69 100 117 149
197 201 227-228 231 249 273 280- 
281 287 291-292 302 312 358 362 
426-427 430 445 462 475 479 535 
597 620 630 651 662-664 722 739 
780 782 785 846 919 960 964 966- 
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1478 
1487 1493 1498 1557-1559 1622
1634 1651 1653 1729 



THA002 



thymus 



Clontech 



THM001 



11 25 85 87 112 137 146 180 
190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325 
333-334 341 351 356 364-365 379 
388 393 396 419-420 441-442 458 
477 483 508 525 531 549 567 606 
608-609 647 681 715 725-727 736
774 782 784 794 827 883 890-891 
899-900 961 997 999-1001 1004 
1034 1055 1097 1129 1144-1145 
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1249 1280 
1305 1345 1355 1369 1434-1435 
1440-1441 1454 1496 1546 1549 
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1687-
1688 1703 1743-1744 1746-1747 
1753 



44-45 54 57-58 62-64 79 104 123 
126 134 153 193 212-213 218 242- 
243 258 274 277 279 297 301 307 
327 330 333 342 351 358 371 410 
430 445 465-466 468 471 483 487 
493 503 506 509 517 526 535 537-
540 545 548 554 567 584 586 590-
591 604 612 621 638-640 645-647
649 656 660 665 670 698 710 720
728 735 739 746 759 762 766-767
775-777 780 784-785 800 802 809
824 826 828 845 851 858-859 864
866 870-871 878 884 887 892 899-
900 927 930-931 967 983 986 990
992 999 1014 1029-1030 1033 1059
1066 1073 1103 1107 1113 1116-
1117 1119 1140-1142 1158 1163
1172 1177 1195 1206 1209 1213
1216 1218-1219 1221-1222 1227
1271 1277 1282 1320 1329 1349
1367 1369 1383-1384 1417 1419
1423 1425-1427 1448 1477 1488
1493 1536 1554 1620 1644 1646
1649 1654-1655 1661-1662 1669-
1670 1674 1676-1677 1685-1688
1707 1711 1731-1732 1737

thymus
Clontech
THMc02



5-9 15-21 25 33 35-36 43-45 48

50-51 54-55 60 75 83 87 89 93 
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 
211 217-219 222 224 229 233 235- 
236 240-241 244 251-252 256 261- 
262 268-269 286 288 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 360 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461 
464-467 470 472 474-476 483 488 
497 500 504 506 513 516 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 630-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698
700 703 708 720 725-726 731 738- 
739 743-744 750-753 757 759 763- 
765 767 772-779 787 789-790 798
800 810 823 829 834-836 841 848 
854-856 859 861 864 870-871 881
890-891 898 908-909 913 928 933
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
1008 1014 1016 1039 1041 1073- 
1074 1079 1089 1097 1109 1114- 
1117 1122 1131 1140-1141 1144-
1145 1163 1172 1175-1177 1186 
1196 1198 1206 1211 1216 1220 
1223 1227 1234-1243 1261-1262 
1267 1271 1280-1281 1284 1290 
1308 1317-1320 1322 1324-1325 
1327 1330 1334-1335 1339 1346 
1350-1351 1355 1357 1360 1370 
1374 1377-1379 1386 1389-1390 
1392 1397 1400 1402 1406-1407 
1417 1423 1425-1427 1440-1441 
1466 1474 1477 1483 1493 1498 
1504 1506 1525 1536 154S 1549 
1566 1594 1598-1600 1608 1611 
1614 1621 1623 1625 1632 1639 
1641 1644 1647 1649 1653-1656 
1658 1662-1663 1671 1673 1678- 
1681 1686-1688 1693 1705 1707 
1711 1717-1718 1726-1727 1731- 
1733 1737-1738 1743-1745 1758- 
1761 1771-1772 1779 1786

thyroid gland
Clontech
THR001

trachea
Clontech
THC001



4 9-10 20-21 37-39 48 50-51 54- 
57 60-61 65-66 71 83 94-96 98-
100 102 104 110 112 115-117 119 
123 127 133 136-137 140 149 152- 
153 155-158 163-164 168-169 171
186 190-192 197 201-203 219-220 
229 233-237 246-247 253 256 258 
262 265-266 268-269 277 280-281
284-286 288-289 298-299 302 309-
311 317 321 326 332 335 341-342 
344 348 350 354 358-359 363 368 
371-373 382-383 385 394 398 400-
401 411 414-415 421 424 430-431 
433-435 443-446 450-452 454-455 
458 472-474 476-478 482 484-485 
487-488 490-494 496-497 500-501
503-504 506 509-513 516-517 519 
524 526-527 529 535-540 547 549 
562 564 569-570 575-576 588 594-
595 601-602 604 606 610 612 615- 
617 619-623 628-630 634-635 642 
647 649-651 660 662-665 668 670
681 690-694 696 698 700 709 721 
727-729 732 734 738 740-741 743 
745 750 759 761 763 765 770 773 
780 785 795-796 798 802 804 823- 
824 826 828 833 838 841-845 847
849 857-860 867 874-875 878 880-
881 887-888 890-892 894-895 898
908 910-911 913-914 922-923 926- 
927 929 932-934 937 939 941-942 
948 953 957 961 963-964 966 978- 
979 981-982 987 990 992 1001 
1004-1006 1010 1014 1020 1024 
1033 1038-1039 1044 1047 1050 
1052-1054 1056 1058 1068 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1142-1143 1146-1147 
1149-1150 1156 1161-1164 1167
1170-1173 1177-1181 1190 1192 
1197 1200 1204 1208-1209 1214 
1217 1219 1222 1230 1232-1233 
1235 1241 1245 1247 1254 1257- 
1258 1260 1262 1271-1273 1283 
1286-1289 1299 1306 1314 1320 
1330-1332 1334-1335 1342 1345 
1349 1365-1367 1370-1372 1374 
1381 1394 1407 1419 1428 1436-
1437 1440-1441 1443 1446-1449 
1454 1459 1461-1462 1468 1470- 
1471 1475 1477 1479 1482 1491 
1497-1498 1504-1505 1507 1513 
1522 1524-1526 1528 1531 1534 
1536-1537 1548 1550 1553 1555- 
1559 1562 1567 1578 1590-1591 
1597 1599-1601 1612 1614 1616 
1619-1620 1622 1624-1626 1628 
1631-1632 1634 1636 1639 1644- 
1645 1648 1651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1678-1681 1683-1686 1689 
1691-1692 1703 1709-1711 1717 
1724-1726 1729 1734 1737-1738 
1740 1743-1744 1749 1753 1759- 
1761 1770 1777 1786 



9 29-31 46 48 87 104 107 110 135 
158 222 262 266 286 301 318 331
352 372 377 384 414 424 445-446 
454 472 474 491 496 560 579 588 
593 597 607 612 626 681 702 719 
810 859 866 878 894-895 912 916
922 932 935 1046 1075 1080 1099- 
1102 1113 1208 1215 1232-1233 
1237 1281 1312 1385 1387 1405 
1414 1424 1430 1437 1447 1505 
1569 1579 1586 1600 1641 1653 
1667 1671 1676-1677 1683 1691- 
1692 1711 1717 1726 1772 



uterus 



Clontech 



UTR001 



17 19 25 41 46 57-58 61 89 104 
108 139 152 174 198 200-201 206 
263-265 274 290 387 408 420 438 
446 448 452 473 491 493 499 503
506 513 519 522 526 530 542-543
560 601 610 632 659 665 720 751 
773 780 833 845 857 872 877 912 
929 934 937 996 1009-1011 1018 
1050 1075 1107 1124 1170 1219 
1258 1279 1287 1310 1320 1323 
1343-1344 1375 1437 1451-1452 
1478 1481 1498 1519 1521 1536 
1552 1579 1597 1602 1606 1620 
1626-1627 1649 1652 1661 1670 
1719 1722-1723 



TABLE 2



SEQ 
ID 
NO: 


ACCESSION 
NUMBER


SPECIES


DESCRIPTION


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


1 


Y41736 


Homo 
sapiens 


Human PRO1114 protein
sequence - 


1398 


100 


2 


X66656 


Homo 
sapiens 


Membrane-bound protein
PRO943.


2389 


99 


4 

5 


AF113136 

AF017806 
X02761 


Homo sapiens 

Mus musculus
Homo sapiens 


IL-1 receptor-associated
kinase-M; IRAK-M
Zn-15 transcription factor
fibronectin precursor


3043 
6351 


100 
77 


8


X02761 
X02761 


Homo sapiens 
Homo sapiens 


fibronectin precursor
fibronectin precursor 


10535 

8990 

12564 


98 
89 
99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


W88501 


Homo sapiens 


Human stomach carcinoma clone 
HP10415-encoded protein.


2381 


100 


11 


AF117754


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 


Z97530 


Homo sapiens 


dJ466N1.4 (novel protein
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin
G)))


896 


100 


13 


Y58620 


Homo sapiens 


Protein regulating gene 
expression PRGE-13.


1894 


98 


14 

16 


AF213457 
AF233453 


Homo 
sapiens 
Homo sapiens 


triggering receptor expressed 

on myeloid cells 2 

RACK-like protein PRKCBP1


1238 


100 


17 
18 


AF201303 
AF064205 


Homo sapiens 
Homo sapiens 


dhfr oribeta-binding protein
RIP60 

dynactin 1 p150 isoform


3124 
3130 


99 

98


19 


U0&059 


Saccharomyces
cerevisiae


Yhr121wp


6377 
174 


100 
26 


20 


AB032903 


Homo sapiens 


guanosine monophosphate
reductase isolog 


1801 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate
reductase isolog 


1485 


99 


22 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


3083 


99 


23 


AF140507


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


2300 


99 


24 


AJ289131 


Homo sapiens 


chondroitin 4-O-
sulfotransferase


2211 


99 


25 

26 
27 


U334S0 

Y44488 
U43701


Homo 
sapiens 
Homo sapiens 
Homo sapiens 


DNA-directed RNA polymerase
I, largest subunit 
ACRP30R2 variant protein, 
ribosomal protein L23a 


8777 
1387 


98 
100 


28
29


U02032 
Y41324 


Homo sapiens 
Homo sapiens 


ribosomal protein L23a
Human secreted protein 
encoded by gene 17 clone 
HNFIY77.


791 
767 
1083 


100 

97 

99 


30 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2.


715 


90 


31 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


631 


82 


32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid
oxidase HAOX2


1811 


100


33 
34 


Z29481 
AB001451 


Homo sapiens 
Homo sapiens 


3-hydroxyanthranilic acid 

dioxygenase

Sck 


1507 


99 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


2869 
1667 


100 
99 


36 
37


Y00644


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 




Y78795


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence.


3586 


78 


38


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence. 


4726 


99 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY


39 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence.


3556 


77 


40


U93121 


Homo sapiens 


M-phase phosphoprotein-1


3747 


100 


41 


Y42750 


Homo sapiens 


Human calcium binding protein 
1 (CaBP-1). 


795 


100 


42 


AP282S26 


Homo sapiens 


latexin 


1189 


100 


43 


G02150 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6231. 


384


94 


44 


U19617 


Mus musculus 


Elf-1


2724 


88 


45 


U19617 


Mus musculus 


Elf-1


2062 


86 


46 


AF100758 


Homo sapiens 


osteoinductive factor OIF 


1538 


100 


47 


Y87591 


Homo sapiens 


Human SPROUTY-1 protein, SEQ
ID NO: 24. 


1737 


99 


49 


X04145 


Homo sapiens 


T3 gamma precursor (aa -22 to 
160) 


942 


99 


51 


X63547 


Homo sapiens 


oncogene 


5645 


99 


52 


M94043 


Rattus 
norvegicus 


rab-related GTP-binding 
protein 


1089 


96 


53 


L31783


Mus musculus


uridine kinase 


917 


71 


54 


X83973 


Homo sapiens 


transcription factor 


4486 


98 


55 


AF224741 


Homo sapiens 


chloride channel protein 7 


4128 


99


56


W74805 


Homo sapiens 


Human secreted protein
encoded by gene 77 clone 
HOEAS24.


1491 




57 


Z50907 


Homo sapiens 


Human TiBC-1 cDNA from second 
transcript.


4824 


100 


58 


D79994


Homo sapiens 


similar to ankyrin of
Chromatium vinosum.


6039 


99 


59 


D79994 


Homo sapiens 


similar to ankyrin of
Chromatium vinosum.




91 


60 


Y59738 


Homo sapiens


Human normal ovarian tissue 
derived protein 15. 


601 


100 


61 


AB031069 




domain 1


1390 


100 


62 


Y66660 


Homo 
sapiens 


Membrane-bound protein
PRO783.


2492 


99 


63


Y66660 


Homo 
sapiens 


Membrane-bound protein
PRO783.


1709 


99 


64 


S70011 


Rattus sp. 


tricarboxylate carrier 


895 


55 


65 


AF139518


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 


66 


W29666 


Homo sapiens 


Homo sapiens DH1306 1 clone 
secreted protein. 


157 


30


67 


AJ245738


Homo sapiens 


claudin-15 


1206 


100 


68 


AF099138


Rattus 
norvegicus 


GLUT4 vesicle protein 


4183 


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein


4906 


86 


70 


Z82059 


Caenorhabditis
elegans


Similarity to Drosophila ring
canal protein comes from 
this gene 


1285 


44 


71 


AF224278 


Homo sapiens 


PMEPA1 protein


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


73


Y41652 


Homo 
sapiens 


Human NSK2 protein sequence. 


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


1207 


100 


75 


AF188622 


Mus musculus 


selectively expressed in 
embryonic epithelia protein- 1 


1485 


74 


76


AE000406 


Escherichia 
coli 


putative DNA topoisomerase 


9S0 


100 


77 


X99302 


Homo sapiens 


Pop1


655


100 


78 


AL136538 


Schizosaccharomyces
pombe


similarity to S. cerevisiae 
kti12 protein


210 


31 


79 


AF129756 


Homo sapiens 


G4 


1554 


99 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


%
IDENTITY 


80 


AL096768


Homo sapiens 


dJ858B16.2
(phosphatidylserine
decarboxylase (PSSC, EC
4.1.1.65))


2033 


100 


81


AL096768


Homo sapiens 


dJ858B16.2
(phosphatidylserine
decarboxylase (PSSC, EC
4.1.1.65))


1220 


96 


82 


X57351 


Homo sapiens


1-8D 


677 


98 


83 


AC005594 


Homo sapiens 


R26984_1


2700 


98 


84 


X73113 


Homo sapiens 


fast MyBP-C


5959 


39 


85 


AF097330 


Homo sapiens 


H1 chloride channel; p64H1;
CLIC4


1305 


99 


86 


AB013423


Mus musculus 


SH2 domain-containing protein


1360 


78 


87 


AF272151 


Homo sapiens 


adaptor protein CIKS 


3084 


99 


88 


AF196329 


Homo 
sapiens 


triggering receptor expressed 
on monocytes 1 


1214 


100 


89 


AB016879 


Arabidopsis 
thaliana 


contains similarity to pre- 
mRNA splicing 
factor~gene_id:MRB17.2


634 


36 


90 


AiJ133721 


Mus musculus 


homeodomain protein 




57 


91 


AJ242864 


Mus musculus 


phtf protein 


619 


61 


92 


A61971 


unidentified 


MCSP 


11676 


93 


93 


Y99365 


Homo sapiens 


Human PRO1250 (UNQ633) amino 
acid sequence SEQ ID NO: 85. 


3890 


100 


94 


Y87231


Homo sapiens 


Human signal peptide 
containing protein HSPP-8
SEQ ID NO: 8. 


1031 


100 


95 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


2428 


95 


96 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


1961 


94 


97 


Y92513 


Homo sapiens 


Human OXRE-10. 


1626 


100 


98 


AL021366 


Homo sapiens 


cICK0721Q.3 (Kinesin related
protein) 


3423 


100 


99 


AC00S733 


Homo sapiens 


R33083_1


1974 


99 


100 


Y95293 


Homo sapiens 


Human GEF 1 containing NEK- like 
kinase substrate sGNK. 


4092 


99 


101 


AL118S01 


Homo sapiens 


dJH9iN16.l (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069) ) 


1S09 


100 


102 


AJ006267 


Homo sapiens 


ClpX-like protein 


3233 


100 


103 


AF100753 


Homo sapiens 


ancient ubiquitous 46 kDa
protein AUP1 


2042 


96 


104 


AB015982 


Homo sapiens 


serine/threonine kinase


4718 


100 


105 


AF151074 


Homo sapiens 


HSPC240 


831


64 


106 


M35522 


Canis 
familiaris


GTP-binding protein (rab7) 


354


50 


107 


R99800 


Homo sapiens 


NTll-i nerve protein, 
facilitates regeneration of 
nerve cells. 


2337 


93 


108 


AF125533


Homo sapiens 


NADH-cytochrome b5 reductase
isoform


1290 


93 


109 


AC005614


Homo sapiens 


F23269_2


3369 


99 


110 


AF064729


Homo sapiens 


RAN binding protein 16 


3285 


100 


111 


X52425 


Homo sapiens 


interleukin 4 receptor 


4496 


100 


112 


Y41686 


Homo 
sapiens 


Human PRO274 protein
sequence. 


2285 


100 


113 


W15506 


Homo sapiens 


Mitogen activating protein 
kinase ERK1. 


1991 


100 


114 


Y71071 


Homo sapiens 


Human membrane transport 
protein, MTRP-16. 


1190 


99 


115 


AL049548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2) 


3497 


99 


116 


AF189817 


Mus musculus 


evectin-2 


1124 


90 


117


W30891


Homo 


Human cytostatin III protein. 


715 


99 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN
SCORE 


IDENTITY 






sapiens 








118 


AF116618 


Homo sapiens 


PRO1038 


1469 


100 


119 


Y08915 


Homo sapiens 


alpha 4 protein 


1748 


100 


120


AF098070 


Drosophila 
melanogaster 


Lis1 homolog


192 


39 


121 


AF052432 


Homo sapiens 


katanin p80 subunit 


181 


37 


122 


Y70743 


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


98 


123 


AF083246 


Homo sapiens 


HSPC028 


2132 


100 


124 


Y27096 


Homo sapiens 


Human viral receptor protein 
(ACVRP).


833 


99 


125 


M63109 


Leishmania 
major 


glycoprotein 96-92 


172 


27 


126 


U75467 


Drosophila 
melanogaster 


Atu 


935 


36 


127 


Z68220


Caenorhabditis
elegans


Similarity to Human ADP/ATP 
carrier protein 


438 


43 


128 


AF095927


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W929S8 


Homo sapiens 


Human zsig44 protein.


463 


100 


130 


AF115391 


Lactobacillus
sakei


ribokinase RbsK


508 


37 


131 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


1250 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein


916


87 


133 


W52811


Homo sapiens 


Human DBI/ACBP-like protein
(DBIHI . 


705 


97 


134 


Y84444 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


3230 


100


135


M69181 


Homo sapiens 


non-muscle myosin B


189 


20


136


W74882 


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FL83. 


480 


100 


137


W78200 


Homo sapiens 


Human secreted protein 
encoded by gene 75 clone 
HHGAU81. 


855 


99 


138 


AL033520 


Homo sapiens 


dJ349A12.1 (similar to 
KIAA0701 protein) 


424 


39 


139 


AF020261 


Santalum 
album 


proline rich protein 


119 


30 


140 


X70394 


Homo sapiens 


zinc finger protein 


1634 


100 


141 


Y06439 


Homo sapiens 


Human protease HUPM-8. 


936 


100 


142 


Z68493 


Caenorhabditis
elegans


predicted using Genefinder 


365 


42 


143 


AB018107 


Arabidopsis 
thaliana


ADP-ribosylation factor-like
protein 


596 


65 


144 


AF161483 


Homo sapiens 


HSPC134 


580 


51 


145


Y84902 


Homo sapiens 


A human proliferation and
apoptosis related protein. 


480 


100 


146 


AB004906 


Ipomoea
purpurea 


transposase 


146 


20 


147 


AC007357


Arabidopsis 
thaliana 


F3F19.18 


647 


31 


148 


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 

HNTME13.


1494 


98 


149 


AF056490 


Homo sapiens 


cAMP-specific
phosphodiesterase 8A 


3710 


99 


150


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7. 


785 


99 


151 


U10397 


Saccharomyces
cerevisiae


Yhr145wp


515 


53


152


X73478 


Homo sapiens 


phosphotyrosyl phosphatase 
activator 


1719 


99 


153 


AL049697 


Homo sapiens 


dJ382I10.5.1 (novel protein


2034 


99 
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TABLE 2 



PCT/US00/34263 



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


IDENTITY










similar to arginyl-tRNA)






154


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


93 


155 


X94703 


Homo sapiens 


rab28 


1126 


99 


156 


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6. 


1471 


100 


158 


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32.


937 


100 


159 


Y17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2).


383 


100 


160 


J04970 


Homo sapiens 


carboxypeptidase M precursor


2395 


100 


161


WS4040 


Homo sapiens 


Human interferon-inducible 
protein, HIFI.


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster
Androgen-dependent Expressed
Protein LIKE putative 
protein) (isoform 1) 


1357 


100 


163 


AF125535 


Homo sapiens 


pp21 homolog


193 


45 


164


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713. 


4^3 


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein 
kinase 


1442 


71 


166 


Ii09649 


mobilis 




X 'J 


o / 


167 


Y73337 


Homo sapiens


HTRM clone [illegible] protein
sequence.


1204 


100 


168 


W86645 


Homo sapiens 


Secreted protein encoded by
gene 112 clone HUKFC71.


1084 


100 


169 


AF214731 


Homo sapiens 


ATP-dependent RNA helicase


4402 


100 


170 


AE000871 


Methanobacterium
thermoautotrophicum


conserved protein 


166 


27 


171 


Y27684 


Homo sapiens 


Human secreted protein 
encoded by gene No. 118. 


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 


Homo sapiens 


This gene is novel . 


3202 


100 


175 


Y07923 


Homo sapiens 


GTP-binding protein 


1205 


100 


176


W9033S 


Homo 
sapiens 


Human DPI homologue protein. 


96S 


100 


177 


Y41675 


Homo sapiens 


Human channel-related
molecule HCRM-3.


1122 


100 


178 


Y41674


Homo sapiens 


Human channel-related 
molecule HCRM-2.


936 


99 


179 


AF220492 


Homo sapiens 


krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens 


C1q B-chain precursor


1240 


100 


181 


U57344 


Mus musculus 


Meis3 


1813 


89 


183 


U57344 


Mus musculus 


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear 
protein 


1389 


58 


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605 


82 


187 


W75058


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 
HLDBG33. 


1188 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase


3705 


100 


191 


Y22203 


Homo sapiens 


Human calcium-binding 
phosphoprotein, CBPP-1,
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12. 


1975 


100 


193 


W87772


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2)
polypeptide.


2605 


99 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 




194 


AF084259 


Mus musculus


bromodomain-containing
protein BP75 


693 


54 


195 


Y00752


Rattus 
norvegicus 


serine dehydratase (AA 1-
327) 


994 


61 


196 


W95349 


Homo sapiens 


Human foetal brain secreted 
protein fhl70_7. 


2596 


100 


197 


AB028859 


Homo sapiens 


hDj9 


1890 


100 


198 


W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236_1.


1614 


100 


199 


Y44277 


Homo 
sapiens 


Human nucleic acid methylase- 
2. 


2096 


99 


200 


AB030039


Homo sapiens 


hPACPLl 


22S8 


100 


201 


X54162 


Homo sapiens 


64 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6142. 


558 


99 


203 


X1388S 


Nicotiana 
tabacum


extensin (AA 1-620)


185 


33 


204 


J04204 


Bos taurus 


32 kd accessory protein 


1837 


100 


205 


J04204 


Bos taurus 


32 kd accessory protein 


1101 


100 




Y87283 


Homo sapiens 


Human signal peptide
containing protein HSPP-60
SEQ ID NO:60.


1318 


100 


206 


Y02860 


Homo sapiens 


Fragment of human, secreted 
protein encoded by gene 65. 


936 


98 


209 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein 
(continues in AL023803)) 


694 


54


210 


AF22S732 


Homo sapiens 


NPD007 


1345 


76 


211 


X66295 


Mus musculus


C1q C chain


970 


73 


212 


Z29328


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


966 


100 


213 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme


542 


96 


214 


AJ002030 


Homo sapiens 


progesterone binding protein


1163 


100 


215 


X70549 


Homo sapiens


member of DEAD box protein 
family 


3933


100 


216


AF250558 


Homo sapiens 


claudin-2 


11*9 


99 


217 




Homo sapiens 


ajaJiDij..i \ putative protein) 


259 


100 


218


Y08565 


Homo sapiens 


UDP-GalNAc polypeptide N-
acetylgalactosaminyltransferase


3331 


99 


219 


Y94452 




Human inflammation associated 


2067 


100 


220 


AL035521 


Arabidopsis
thaliana


£^ ^ LO Li VC U LCXIi 


315 




221 


AL031786 


Schizosaccharomyces
pombe


putative proline- trna 
synthetase 


811 


41 


222 


AL109736 


Schizosaccharomyces
pombe


WD repeat protein 


626 


40 


223 


X52493 


Glycine max 


DNA-directed RNA polymerase


136 


23 


224 


AL03 5SS9 


Homo sapiens 


dJ979N1.1 (dJ979N1.1)


5199 


98 


225 


AB032401


Mus musculus 


mmDj4 


1761 


92 


226 


AB032401 


Mus musculus


mmDj4


1988 


92 


227 


X83502 


Saccharomyces
cerevisiae


J1007 


112 


26 


228


X83502 


Saccharomyces
cerevisiae


J1007 


79 


2S 


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557 


99 


230 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


982 


100 


231 


AB027466 


Homo sapiens 


spondin 2 


1756 


99 


232 


W95634 


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233


K00365 


Homo sapiens 


Human cyclin B1.


2218 


99 


234 


YS376i 


Homo sapiens 


A GTP-binding polypeptide


1017 


100 
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TABLE 2 
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SEQ 
ID 
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 








designated RAQ.






235


Z50749 


Homo sapiens 


yeast sds22 homolog 


1800 


100


236


Z50749 


Homo sapiens 


yeast sds22 homolog


1754 


98 


237 


AB026491 


Homo sapiens 


PICK1 


2137 


100 


238 


AJ270205 


Entodinium 
caudatum


putative 

phosphatidylinositol-4-
phosphate 5-kinase


114 


37 


239


AB030189 


Mus musculus 


contains transmembrane (TM) 
region and ATP binding region 


710


93


240 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP).


3785 


99 


241 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP).


3436 


99 


242 


AF155107 


Homo sapiens 


NY-REN-37 antigen


996


99 


243 


AF155107 


Homo sapiens 


NY-REN-37 antigen


1005


100 


244 


AL031320


Homo sapiens 


dJ20N2.1 (novel protein
similar to yeast and 
bacterial cytosine 
deaminase) 


763 

1 


99 


245 


U37026


Rattus 
norvegicus 


sodium channel beta 2 subunit 


162 


30 


246 


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086))


2391 


98 


247


U32274 


Saccharomyces
cerevisiae


Ydr385wp; CAI: 0.12 


191 


37


248
249 


Y41719 
AB029434 


Homo
sapiens 
Homo sapiens 


Human PRO564 protein
sequence.
ghrelin precursor 


1079 


100 


250 


X97831 


Rattus 
norvegicus 


carnitine/acylcarnitine 
carrier protein 


611 
246 


100 
38 


251 


W80993


Homo 
sapiens 


Human RIP-interacting factor
RIF.


1724 


100 


252 


Y94873 


Homo 
sapiens 


Human protein clone HP02632, 


1876 


100 


253 


W59878 


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIF-2 (HEBGM49).


765 


100 


254 


AL354533 


Leishmania
major


possible adenylate kinase 


265 




255 


AF233322 


Mus musculus 


zinc transporter like 2 


1916 


95 


256 


Y78113 


Homo sapiens 


Human cytokine signal 
regulator CKSR-1 SEQ ID 
NO:1.


2247 


99 


257 


AL035539 


Arabidopsis 
thaliana


putative amino acid transport 
protein 


390 


27 


258 


W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 
HHFHN61.


1171 


100


259 


lU^03S689 


Homo sapiens 


dJ187J11.1 (novel protein
similar to protein kinase C 
inhibitors) 


974 


100 


260 
261


AE000909 
AL0S0131 


Methanobacterium
thermoautotrophicum
Homo sapiens 


serine/threonine protein 
kinase related protein 

hypothetical protein 


363 


30 


262 




Mus musculus. 


zeta proteasome chain; PSMA5 


626 
1214 


100 
100 


263 


AL035S93 


Homo sapiens 


dJ310J6.1 (novel protein)


821 


100 


264 


AL022318 


Homo sapiens 


bK150C2.3 (PUTATIVE novel
protein similar to APOBEC1)


1072 


100 


265 


AF205940 


Homo sapiens 


endomucin 


1289 


100 


266 


AL023583 


Homo sapiens 


dJ500L14.1 (novel protein) 


789 


100 


267


AL034548 


Homo sapiens 


dJH03G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein C8FW) 


1888 


99 
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SEQ 
ID 
NO: 



ACCESSION 
NUMBER 



AF161470
AF161470 



SPECIES 



Homo sapiens 



DESCRIPTION 



HSPC121 



SMITH-
WATERMAN 
SCORE 



1884



IDENTITY 



98 



270 
271 



X90763 



Homo sapiens 



HSPC121 



Homo 
sapiens 



HHa5 hair keratin type I 
intermediate filament 



1232 



2190 



96 



99 



AF207600 
M32334 



Homo sapiens 



ethanolamine kinase 



1952 



100 



Homo sapiens 



intercellular adhesion 
molecule 2 



1436 



100 



273 
274 



AF161483 
YS3052 



Homo sapiens 
Homo sapiens 



HSPC134 



663 



6T" 



276 



Y77S76 



Homo sapiens



Human secreted protein clone 
df202_3 protein sequence SEQ
ID NO:110. 



Human cytoskeletal protein 
(HCYT) (clone 2195418) . 



762 



100 



100 



AF077042 



Homo sapiens 
Homo sapiens



30S ribosomal protein S7
homolog 



1269 



100 



Human secreted protein clone 
cal06^i9x protein sequence 
SEQ ID NO: 20. 



1619 



98 



Homo sapiens 



Amino acid sequence of a 
human phosphorylation
effector PHSP-20. 
rod transducin 



2801 



99



Z75134 
Z75134 



Canis

familiaris 



1816 



100 



Canis

familiaris 



rod transducin 



1718 



96 



282 
283 



AF249873 
AL050007



Homo sapiens 
Homo sapiens



muscle-specific protein



1395 



100 



hypothetical protein



405 



98 



Homo sapiens 
Homo sapiens 



DC1 



1859 



99 



285 
286 



287 



AF156102
Y35897 



Homo sapiens 



tl88964 _ 



Homo sapiens 



ELL complex EAP30 subunit



Extended human secreted 
protein sequence, SEQ ID NO. 
146. 



1318 



1250 



HEM45



923 



99 



99 



100 



289 
290 



AJ011098 
Y66724 



Homo sapiens 



Homo sapiens 



hypothetical protein 



telethonin 



598 
57T- 



100 



100 



291 
292 
293 



AF034801 



Homo 

sapiens 

Homo sapiens



Membrane-bound protein
PRO836.



2321 



liprin-alpha4 



2565 



ToTT 



98 



Homo sapiens
Homo sapiens



liprin-alpha4



2590 



100 



dJ889J22B.l (novel protein 
(isoform 1) ) 



1738 



100 



294 



295 
296 



Homo sapiens 



L11672
AL035423 



Homo sapiens 



htrm clone 8396'5l protein 
sequence . 



1245 



zinc finger protein 



1694 



99



44 



Homo sapiens 



dJ20l3.1 (brain mitochondrial
carrier protein-1 (BMCP1))



1024 



79 



297
298 



Homo sapiens 



lymphoid enhancer binding 
factor-1



2173 



100 



AF161417 
AF159141 



Homo sapiens 



HSPC299

breast cancer metastasis-



1147 



85 



301 



302 
303



Homo sapiens 



1236



U26397 



AF036145 



Rattus 
norvegicus 



suppressor 1 



Homo sapiens 



inositol polyphosphate 4- 
phosphatase 



160 



meningioma-expressed antigen
5



3458 



Z82022 
AF269232 



Homo sapiens 



GlcNAc-1-P transferase
butyrophilin-like protein 



2067



30 



100 
99 



304 



Mus musculus 



AJ222644 



Arabidopsis 
thaliana 



BUTR-1 

asparaginyl-tRNA synthetase



271 



659 



50 



50 



Homo 
sapiens 



hematopoietic cell derived 
zinc finger protein 



51 



79 



308 



AJ272079



Y44486 
AJ131891 



Homo sapiens 



Homo 
sapiens 



APOBEC-1 stimulating protein 



Human GPRW receptor 

polypeptide . 

DNA polymerase mu 



056 



1721 



100 



Homo sapiens 



2598 



100 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


%
IDENTITY


310 


AF293335 


Homo sapiens 


p30 DBC


1248 


92 


311


AF17652S 


Mus musculus 


F-box protein FBL12


1501 


93 


312 


X57802 


Homo sapiens 


immunoglobulin lambda light 
chain 


959


81 


313 


Z3S715 


Homo sapiens 


Net 


2048 




314 


AF161532 


Homo sapiens 


HSPC047 


727


100 


315 


AF208068 


Homo sapiens 


kelch-like protein KLHL3a


3046


100 


316 


Y66666 


Homo 
sapiens 


Membrane-bound protein
PRO1013. 


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein rapr-1.


1253 


98 


318 


AJ3 877 4 7 








59 


319 


AF161362 


Homo sapiens 


HSPC099 


224 


40 


320 


Y68773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-5.


2243 


99 


321 


T1 1 O "5 1 ft 

AO 23 B379 


Homo sapiens 


putative TH1 protein


3013 


100 


322 


AB040812 


Homo sapiens 


protein kinase PAK5


3792 


99 


323 


Y95013 


Homo sapiens 


Human secreted protein 
vc48_l, SEQ ID N0:6S. 


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of
protein PRO271.


1976 


100 


325 


Y94944 


Homo sapiens 


Human secreted protein clone 
bfl57__16 protein sequence 
SEQ ID NO: 94.


2305 


98 


326


Y76884 


Homo sapiens 


Retinoblastoma binding
protein-7 sequence.


6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding
factor-1 


2173 


100 


328 


Z78013 


Caenorhabditis
elegans


Similarity to Drosophila 
Cadherin-related tumor
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


484 


94 


330 


Z75330 


Homo 

sapiens] 

>R6S207 

R65207 02- 

MAR-1995 27- 

AUG-1993 

Human 

stromalin-1.

[Homo 

sapiens 


nuclear protein SA-l 


6492 


99 


331 


HT.nftBRai 

■nJUU UOSOJ 


Homo sapiens 


dJ327J16.3 (supported by 
GENSCAN, FGENES and GENEWISE)


2133 


99 


332 




Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


310 


41 


333 


AJ271669 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus


p53-regulated DDA3


997 


64 


335


M99058 


Eimeria 
maxima 


em100 gene is homologous the
Eimeria tenella gene et100


154


26 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3386 


97 




Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3447 




339 


ZS6561 


Caenorhabditis
elegans


Similarity to Human rab13
protein (PIR Acc. No. 
A49647).


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027.


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L291S4 


Homo sapiens 


immunoglobulin heavy chain 


439


84 
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SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


%
IDENTITY








VDJ region 






344 


U10281


Sus scrofa


gastric mucin 


279


24 


345


AK000404 


Homo sapiens 


unnamed protein product 


1177 


99 


346 


L22557


Rattus 
norvegicus 


calmodulin-binding protein


1949 


84 


347 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


2363 


91 


348 


AL049481 


Arabidopsis 
thaliana


AIG1-like protein


316 


30 


350 


AJ251516 


Mus musculus


cysteine and histidine-rich 
protein 


1460 


99 


351 


AK024477 


Homo sapiens 


FLJ00070 protein 


1773 


100 


352


U50133 


Homo sapiens 


ankyrin 


502 


33 


353 


AK000625


Homo sapiens 


unnamed protein product 


721 


100 


354 


AF161420 


Homo sapiens 


HSPC302 


2623 


97 


355 


AJ010014 


Homo sapiens 


M96A protein 


1269 


47 


356


AF151029 


Homo sapiens 


HSPC195 


941 




357 


AL022327 


Homo sapiens 


dJ35SC10.1 (KIAA0027) 


1911 


100 


358 


W78128




Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


J. j. J. / 


100 


359 


X03414 


Drosophila 
melanogaster 


Kr polypeptide 


316 


S3 


360 


AF151079 


Homo sapiens 


HSPC245 


643 


100 


361 


Y53886 


Homo sapiens 


A suppressor of cytokine
signalling protein 
designated HSCOP-6. 


530 


41


362


AF254741


Drosophila
melanogaster 


Centaurin Gamma 1A


681 


A C 


363 


AF213465 


Homo sapiens


dual oxidase 


2016 


100 


364 


AF181562 


Homo sapiens


proSAAS


1319 


100 


365 


AF181562 


Homo sapiens 




1024 


99 


366 


U73200 


Mus musculus 


pll6Rip 


884 


82 


367 


AF263744 


Homo sapiens 


erbb2-interacting protein
ERBIN 


4973 


99 


368 


U37501 


Mus musculus


laminin alpha 5 chain


5867


72 


369 


AF043S95 


Caenorhabditis
elegans


similar to the protein 

pxius^nates camiiy 


S49 


36 


370 


Y73440 




Human secreted protein clone
yj23_l protein sequence SEQ 
ID NO: 102 . 


1484 


99 


371 


AF272833 


Homo sapiens 


misato 


«OD7 



97 


372 


AF198454 


Homo sapiens 


epithelial protein lost in
neoplasm beta 


3927


100 


373 


Y73345 


Homo sapiens 


HTRM clone 438283 protein 
sequence . 


273 


80 


374 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase 


2717


98 


375 


A95106 


unidentified 


RED ALPHA 


1202 


99 


376 


W74828


Homo sapiens 


Human secreted protein 
encoded by gene 100 clone 
HLQA352 . 


1012 


99 


377 


Y32131 


Homo sapiens 


Human LYST-2 protein. 


3556 


99 


378 


M14912 


Homo sapiens 


pol 


132 


86 


379 


AF090934 


Homo sapiens 


PRO0518


382 


100 


380 


K66363 


Homo sapiens 


serine/threonine protein
kinase 


2499 


100 


381


Y41699 


Homo 
sapiens 


Human PRO703 protein 
sequence . 


2362 


100 


382 


AF174498


Homo sapiens 


GR AF-l specific protein 
phosphatase 


7008 


98 


383 


U64606 


Caenorhabdit
is elegans 


coded for by C. elegans cDNA 
yk173c12.5


246 


36 


384 


U50133 


Homo sapiens 


ankyrin 


502 


33 


385 


AJ238520 


Homo sapiens 


putative transcription 
factor-like nuclear regulator


4123 


97


TABLE 2



PCT/US00/34263 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


IDENTITY 


387 


AF208845


Homo sapiens 


BM-003


1375 




389 


X57821 


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


76 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 


Y85564


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100


394 


AF229928 


Drosophila 
melanogaster


cytoplasmic protein 89BC 


1616 




395 


AF181721 


Homo sapiens 


RU2S 


2254 


100 


396 


Y69197 


Homo sapiens 


Amino acid sequence of a 
human beta IV-spectrin


1626 


98 


397 


U48238 


Mus musculus 


protein neuro-d4


749 


60 


398 


AL390137


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599 


Schizosaccha
romyces
pombe


WD repeat protein 


447 


27 


401 


AC004859


Homo sapiens 


similar to 2-oxoglutarate 
dehydrogenase / similar to 
Q02218 (PID:g1352S18)


4176 


78 


402 


AB010266 


Mus musculus


tenascin-X 


10246


62 


403 


AL133286 


Homo sapiens


aubviuv t i isamilar to 
D. melanogaster CG5986

£/L ULc 111 / 


761 


100 


404 


Z68753


Caenorhabdit
is elegans


.JO 


838 


48 


405 


Z78013


Caenorhabdit
is elegans 


i»i^-iQi _L ^- y t-tj LiiUsupniia 

Cadherin-related tumor
suppressor 


569 


33 


406 


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 


407 


AF155106 


Homo sapiens 


NY-REN-36 antigen 


1168


100


408 


Y57945 


Homo sapiens 


Human transmembrane protein
HTMPN-69.


1538 


99 


409 


Z18361 


Ovis aries 


trichohyalin 


184 


30 


410 


AF249744 


Homo sapiens 


RhoGEF 


2733 


100 


411 


AF176529 


Mus musculus 


F-box protein PBvn




94 


412 


AF210842 


Homo sapiens 


HARP 


4880 


100 


413 


AL63165B 


Homo sapiens


diJ31.nm"i 7 irtewf^l nv^t'A'i n 

s imil ar to H vo>-<» f* 5* -5 wo dttt _ 
k? •tiim.^.a.k im.\*r jn • iuicli&j, nzv.tr d jl 

3) 


776 


93 


414 


X57398 


Homo sapiens 


pm5 protein 


6131 


99 


415 


AB029826 


Homo sapiens


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit


2961 


99 


416 


U43503 


Saccharomyce
s cerevisiae 


Lphlp 


115 


42 


417 


AL160493 


Leishmania
major 


possible t26fl7.21 


239


"Tc 


418 


Y08100 


Homo sapiens 


Human PRO331 protein.


330 


29 


419 


U15131


Homo sapiens 


p126


2228 


54 


420 


AF117946 


Homo sapiens 


Link guanine nucleotide 
exchange factor II 


2363 


100 


421


AF190635 


Drosophila
melanogaster


ankyrin 2


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2


1962 


100 


423 


AL137530


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753


Homo sapiens 


son-a 


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor


1084 


55


TABLE 2



PCT/US00/34263



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


IDENTITY 


427 




Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56 


428 


AE003683 


Drosophila 
melanogaster 


CG8312 gene product 


149 


29 


429 


Y07829 


Homo sapiens 


RING finger protein 


2201 


99 


430 


apnoc nan 


Drosophila
melanogaster 


pushover 


4442 


47 


431 


U41387 


Homo sapiens 


Gu protein 


4021 


99 




AC U^JD / 4 


Homo sapiens 


nephrocystin


3783 


100 


il "S "1 


Ac 14o /o0 


Homo 
sapiens 


septin 2-like cell division
control protein 


2284 


100 






Arabidopsis
thaliana 


cleft lip and palate 
associated transmembrane 
protein-like 


886 


42 


437 


Y94247


Homo sapiens 


Human calcium binding protein 
hCBP. 


1704 


100 


438 


" W M U W / £_a 


Homo sapiens 



UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransfera
se


1075 


63 


439




Bos taurus 


tuftelin 


285 


33 


440 


R06463 


Homo sapiens 


Derived protein of clone 
ICA13 (ATCC 40553) . 


3073 


99 


441


X14971


Mus musculus


alpha-adaptin (A) (AA 1-977)


4897 


98 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1- 
938) 


3979 


81 




Y66689


Homo 
sapiens 


Membrane-bound protein 
PRO1136.


3299 


99 


444


» /*• rt CT "7 C 

AtUb / /b4 


Arabidopsis
thaliana 


unknown protein; 20348-23707 


114 


33 


445 


AF229032 


Mus musculus


piL 


2077 


93 


446 


AF056035 


Rattus 
norvegicus 


s-nexilin 


2662 


85 


447 


AF132484 


Mus musculus


unknown 


478 


51 


448 


WS9024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156 . 


528 


45 


449 


AF161445 


Homo sapiens 


HSPC327 


1606 


100 


450 


Z68753 


Caenorhabdit 
is elegans 


ZC518.3b


951 


49


451


W39160 


Homo sapiens 


Human partial complement 
factor H protein fragment 3.


155 


32 


452


W85727 


Homo 
sapiens 


Novel protein (Clone
BM46_10).


2799 


99 


453 




Homo sapiens 


A bone marrow secreted 
protein designated BMS115. 


2810 


100 


454 


D87438 


Homo 
sapiens 


Similar to a C.elegans 
protein in cosmid C14H10 


4069 


100 


455 


AF240468 


Homo sapiens 


nicastrin 


3687 


100 


456 




Homo sapiens 


CENP-E 


13305 


99 


457 


M59216 


Homo 
sapiens 


gamma-aminobutyric acid 
receptor beta-1 subunit 


2477 


100 


458 


Y73467


Homo sapiens


Human secreted protein clone
yd61_l protein sequence SEQ
ID NO: 156.


966 


100 


459 


W67824 


Homo sapiens 


Human secreted protein 
encoded by gene 18 clone
HSLFM29. 


535 


100 


460 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


279 


19 


461 


D87446 


Homo sapiens 


Similar to a C.elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


G04044


Homo sapiens 


Human secreted protein, SEQ 

ID NO: 8125. 


486 


93 


463 


AC002398 


Homo sapiens


F25965 1 


1018 


100 


464 


AF064856 


Rattus sp. 


7acomp protein 


1845 


84 


465 


AF223408


Homo sapiens 


B99 


3686 


99


TABLE 2



PCT/US00/34263 



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


IDENTITY


466
467


AF223408


Homo sapiens


B99


2878


87


468 
469 


AF104415 
U53450 

AL031297 


Mus musculus 
Rattus 
norvegicus 
Homo sapiens 


gene trap locus-13

Jun dimerization protein 1

JDP-1

dJ97P20.1 (novel gene) 


6336 
196 


91 

49 1 


470 


AF257077 


Homo sapiens 


eukaryotic translation
initiation factor EIF2B 
subunit 3 


3564 
1274 


99 
95 


471 


L28125 


Podospora
anserina


beta transducin-like protein 


284 


38 


472 


Y84903 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473 


AF144237 


Homo sapiens 


LOMP protein 


252 


44 


474 


Y71213 


Homo sapiens 


Human irritable bowel disease 
related polypeptide IMX39.


838 


100 


475


Y95006 


Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO: 52. 


3411 


100 


476 
477 


D38549 
AF241230


Homo sapiens 
Homo sapiens 


iiaj,u«3 is xiew 

TAK1-binding protein 2


6533 


99 


478 


AL031534 


Schizosaccha 

romyces 

pombe 


putative asparagine synthase 


3656 
482 


100 
40 


479 
480 

481


L28125
AF161544 


Podospora 
anserina 
Homo sapiens 


beta transducin-like protein
HSPC059


233 
434 


26 
77 


482 


AJ238248
Z38061


Homo sapiens 
Saccharomyce 
s cerevisiae 


centaurin beta2 

0.3, AMYH_YEAST P08640


3986 
295 


99 
23 


483
484


A Pi ei ■jdi 
Ar Xolool 

AF223468 


Homo sapiens 
Homo sapiens 


GLUCOAMYLASE S1 (EC 3.2.1.3)

HSPC263 

AD021 protein 


1404 
1314 


100 
100 


486 
487 
488 


X57527
Y19062
Y73373 


Homo sapiens 
Homo sapiens 
Homo sapiens 


alpha 1 (VIII) collagen
39k3 protein 

HTRM clone 921803 protein 
sequence. 


4166 

<o*fc f 3 

555 


99 
100 

56 


489 


AL021918 


Homo 
sapiens 


b34I8.1 (kruppel related Zinc
Finger protein 184) 




100 


490 
491


X53773 
U52426 


Rattus
norvegicus
Homo sapiens


alpha-c large chain (AA 1-

938) 

GOK 


4675 


0*7 " 


492 


AL359773 


Leishmania 
major 


possible threonine synthase 


1459 


59 
45 


493 


AF226614


Homo sapiens 


ferroportin1


2929 


100 


494 


Z93241 


Homo sapiens 


dJ222E13.1 (novel protein
with some similarity to
Drosophila Kraken)


513 


96 


495 


AF036977 


Homo sapiens 


unknown


1 01 o 


100 


496 


U93564 


Homo sapiens 


p40 


133 


45 


497 


Y91405 


Homo sapiens 


Human secreted protein 


357 


100 








sequence encoded by gene 2 
SEQ ID NO: 126.




498 


AF069781 


Drosophila 
melanogaster 


Bem46-like protein


653 


43 


499 


Y16601 


Homo sapiens 


Human cell-cycle
phosphoprotein CECYP-2. 


1658 


98 


500


X70944 


Homo sapiens 


PTB-associated splicing
factor 


3883 


100 


501 
502 


AF027563 
AF282874 


Mus 

musculus 
Homo sapiens 


putative membrane-associated
guanylate kinase 1 
nectin 3; PRR3 


205 
2856 


36 
99 


503 
504 
505 


AiJ24 9732 
AF208861 
L09708 


Homo sapiens 
Homo sapiens 
Homo sapiens


38 protein 
BM-019 

complement component C2 


469 

1629 

4022 


100 
100 
100 


507 

508


X66285
D00189


Mus musculus
Rattus
norvegicus


HC1 ORF

Na+,K+-ATPase alpha-subunit


115 
5227 


43 
99 






WO 01/53312 



TABLE 2 



PCT/US00/34263 



SEQ

ID 

NO: 


ACCESSION
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


509 


Y94971 


Homo sapiens 


Human secreted protein clone
fal71_l protein sequence SEQ
ID NO: 148.


2176 


100 


510 
511 


AB019038 
AB019038 


Homo sapiens 
Homo sapiens 


beta-1,4 mannosyltransferase
beta-1,4 mannosyltransferase


781 


77 


512 


AB019038 


Homo sapiens 


beta-1,4 mannosyltransferase


1347 
1520 


100 
99 


513 
514 


X84908 
X52851 


Homo sapiens 
Homo sapiens 


phosphorylase kinase 
peptidylprolyl isomerase


5729 


99 


515 


AF186084 


Homo 
sapiens 


epidermal growth factor 
repeat containing protein 


650 
3046 


76 
99 


516 


G03602 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7683. 


505 


99 


517 


U04706 


Bos taurus 


50 kDa protein


1749 


77 


518 


G00653 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4734.


530


100 


519 


AF161475 


Homo sapiens 


HSPC126 


1368 


100 


520 


Y99366 


Homo sapiens 


Human PRO1475 ftJN074K^ amino
acid sequence SEQ ID NO: 88. 


3394


97 


521 


AF266852 


Homo sapiens 


PTPLA 


1295 


100


522 


AE000995 


Archaeoglobu 
s fulgidus 


chromosome segregation 


153 


20 


523 


AF062249 


Homo sapiens 


immunoglobulin heavy chain 


605 


57 


524 


AJ223830


Rattus
norvegicus


ARE1


2950 


98 


525 


W01535


Homo sapiens 


Cellular homologue of the 
SV40 large T antigen. 


1276 


83 


526 


AF145658 


Drosophila
melanogaster


BcDNA.GH10229 


320 


33 


527 


AF112213 


Homo sapiens 


putative Rab5-interacting
protein 


524 


79 


528


D49387 


Homo 
sapiens 


NADP-dependent leukotriene B4
12-hydroxydehydrogenase


1616 


100 


529 


Y30819 




Human secreted protein

encoded from gene 9. 


328 


32 


530 


AL079335 




J- J £, c * ^ \ f £, , x. IvL/cx pcOtciH 

(DKFZP564A032, SBBI88) 
similar to mouse IFN-gamma
induced MG11


1059 


99 


531 


Y91506 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 56 
SEQ ID NO: 179. 


X XDU 


98 


532 


X76116 


Caenorhabdit 
is elegans 


carrier protein (c2) 


576


50


533 


X76116 


Caenorhabdit 
is elegans 


carrier protein (c2) 


506 


50 


534 


X12966 


Homo sapiens 


3-oxoacyl-CoA thiolase 
propeptide (424 AA) 


1972 


100 


535 


Y09267 


Homo sapiens 


flavin- containing 
monooxygenase 2 


2486 


100 


536 


Z11773


Homo sapiens 


SRE-ZBP 


2201 


99 


537 


D84224 


Homo sapiens


methionyl tRNA synthetase


4741 


99 


538 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


3887 


99 


539 


D84224 


Homo sapiens 


methionyl tRNA synthetase


2933


96


540


D84224 


Homo sapiens 


methionyl tRNA synthetase 


4529 


99 


541


J03244


Bos taurus


H+ ATPase 31kDa subunit (EC 
3.6.1.3) 


848 


77 


542 


Y92514 


Homo sapiens 


Human OXRE-11.


2301


99 


543 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein


2151 


61 


544 


AE000919 


Methanobacte 
rium 

thermoautotr 
ophicum 


conserved protein 


207 


38 


545 


A06669 


synthetic 
construct 


preTGF-beta1


2070 


99


TABLE 2



PCT/US00/34263 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


%

IDENTITY


546 


Y02698 


Homo sapiens 


Human secreted protein 

encoded by gene 49 clone
HTPCS60.


854 


98 


547 


AF112205 


Homo sapiens 


VJSB-i protein 


2275 


100


548 


X60271 


Mus musculus 


c-rel 


2264 


74 


549


AC016827 


Arabidopsis 
thaliana


putative GTPase 


810 


42 


550


Y70400 


Homo 
sapiens 


Human cell-signalling
protein-2.


429 


68 


551


AB048365 


Homo sapiens 


NEDD4-like ubiquitin ligase 1


8290 


99 


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4 . 




95 


553 


AF119855


Homo sapiens


PRO1847


265 


67 


554 


M17236


Homo sapiens




1332 


100 


555 


AL078468 


Arabidopsis
thaliana






40 


556


AC006963 


Homo sapiens 


similar to BAA77027 
(PID:g4650844) 


515 


44 


557 


AK024487 


Homo sapiens 


FLJ00086 protein


1623 


98 


558 


M1214Q 


Homo sapiens 


f yCiJC }Jl\Jl*tSXTi f AXX. 


117 


48 


559 


W74825 


Homo sapiens 


Human secreted protein 
encoded by gene 97 clone 

HAORF73 


225 


56 


560 


X56581 






373 


88 


561 


AP00313 6 


Caenorhabdit
is elegans 


i.wxiL\a wedK s jLTTixxarxty to 

ATI &MDtoKl "i v*r<r m r~\ f~ A 
ail Mrlf JLJJ.I1UJLX1M IIIO t J- X, 


2926 


54 


562 


AL109839




*P UJ -y * A / oitituLng protein ; 


877 


100 


563 


AF181640


Drosophila
melanogaster


BcDNA.GH09817


289 


42 


564 


AF052723 


Feline

leukemia 

virus 


gpreo 




43 


565


AF161472 


Homo sapiens 


HSPC123 


439 


44 


566 


Y28817 




Human secreted protein.


3338 


100 


567 


U09848 


Homo sapiens 


zinc finger protein 


1738 


100


569 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3603 


93 


570


AF155113 


Homo sapiens 


NY-REN-55 antigen


3951


99 


571


AL032821 


Homo sapiens 


tuj33^*j-x ( vsnin j.) 


1821 


98 


572 


M69181 




non-muscle myosin B


7350 


99 


573 


M69181 


Homo sapiens 


non-muscle myosin B


7311 


98 


574 


YS9678 




Secreted protein 108-008-5-0-
E6-FL. 


772 


100 


575 


AL365234 


Arabidopsis
thaliana




788 


40 


576 


AL365234 


Arabidopsis
thaliana


putative protein 


788 


40 


577 


X06745 


Homo sapiens 


DNA polymerase alpha-subunit
(AA 1-1462)


7619 


99 


578 


AB041642 


Homo sapiens 


PAR-6


1342 




579 


D86984 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


2446 


100 


580 


AF165124 


Homo sapiens 


gamma-aminobutyric acid A
receptor gamma 2 


2499 


99 


581 


W88812 


Homo sapiens 


Polypeptide fragment encoded 
by gene 58. 


2339 


99 


582 


U82319 


Homo sapiens 


novel ORF 


342 


100


583


P92219 


Homo sapiens 
(human) 


CR1 protein. 


11425 


99 


584 


AJ223948 


Homo sapiens 


RNA helicase 


6608 


99 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex
protein 


3874 


99 


586


Y42384 


Homo 
sapiens 


Amino acid sequence of 
Iv3l0 7. 


1007 


37


587


AF129756 


Homo sapiens 


BAT4 


1873 


98


SEQ
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


IDENTITY 


588 


AK131775 


Homo sapiens 


Unknown 


1929 


-* y 


589


AJ250865 


Homo sapiens 


TESS 2 


2348


100


591 


Z98885


Homo sapiens 


dJ522J7.2 (bromodomain-
containing 1 (similar to
peregrin, BR140))


4167 


100 


592 


L76571 


Homo sapiens 


nuclear hormone receptor 


1355 


100 


593 


AF091622 


Homo sapiens 


PHD finger protein 3 


9054 


100 


594 


X56807 


Homo sapiens 


desmocollin type 2a 


4443 


100 


595 


AL137802


Homo sapiens


dJ798A10.1 (novel protein)


212 


55 


596 


AQ022329 


Homo 
sapiens 


bK407F11.2 (adrenergic, beta,
receptor kinase 2) 


3653 


100 


597


AF226048


Homo sapiens 


GL003 


2009 


99 


598 


AJ278112


Homo
sapiens


putative cell cycle control 
protein 


335 


23 


599 


Y59741 


Homo sapiens 


Human normal ovarian tissue 
derived protein 10. 


1574 


99 


600 


L36531 


Homo sapiens


integrin alpha 8 subunit 


5386 


99 


601 


Y38458


Homo sapiens 


Human secreted protein 
encoded by gene No. 20.


895 


100 


602 


AF218584 


Homo sapiens


GGA1 


3265 


100 


603 


Y13115 


Homo sapiens 


serine/threonine protein
kinase


5071 


99 


604 


AL132776 






2413 


99 


605 


AL034452 


Homo sapiens


aubozuib.i (novel collagen
triple helix repeat
containing protein)


1979 


100 


606 


Y14494 


Homo sapiens 


aralar1


3465 


99 


607 


AJ001981 




OXA1L


2603 


100 


608 


X86098 


Homo 
sapiens 


type 5 E1A protein 


3069


100 


610 


AF163572


Homo sapiens


Forssman glycolipid
synthetase 


JL t)03 


99 


611 


AF161503 


Homo sapiens


HSPC154 


1261 


97 


612 


L41834 


Ensis minor 


nuclear protein 




30 


613 


Y91954


Homo sapiens


Human cytoskeleton associated
protein 9 (CYSKP-9).


3668 


100 


614 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027)


361 


94 


615 


X85786 


Homo sapiens 




3203


100 


616 


Y08319 


Homo sapiens 


kinesin-2


3487 


99 


617 


D12644 


Mus musculus 


KIF2 protein 


JoU? 


97 


618 


U28789 


Mus musculus 


PACT 


5936 


89 


619 


Y35914 


Homo sapiens 


Extended human secreted

protein sequence, SEQ ID NO.
163.


1684 


99 


620 


A3046382 


Mus musculus 


protein 




23 


621 


Y00062 


Homo sapiens 


precursor polypeptide (AA -23
to 1120) 


3440 


99 


622 


AF068286


Homo sapiens 


HDCMD38P 


861 


100 


623 


X98248 


Homo sapiens 


sortilin 


4436 


99 


624 


X61100 


Homo sapiens 


75 kDa subunit NADH 
dehydrogenase precursor 


3734 


99 


625 


S58544 


Homo sapiens 


75 kDa infertility-related
sperm protein 


2125 


99 


626 


AF151027 


Homo sapiens 


HSPC193 


582 


93 


627 


X14968


Homo sapiens


RII-alpha subunit (AA 1-404)


2079 


100 




Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1983 


100


SEQ
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


IDENTITY 


629 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone
vb7_l derived protein 






630 


AF098786 


Homo 
sapiens 


17 beta-hydroxysteroid 
dehydrogenase type VII 


1 *7C4 
X / 34 


100 


631 


AL034555 


Homo 
sapiens 


dJ134019.3 (zinc finger 
protein 151 (pHZ-67))


4273 


100 


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94 . 


794 


96 


633 


AF288288 


Homo sapiens 


HPT protein 


2236


100


634 


AF041429 


Homo sapiens 


pRGRl 


823 


99 


635


X66357 


Homo sapiens 


serine/threonine protein
kinase 


1589 


100 


636 


Y11284 


Homo sapiens 


AFX1 


2571 


98 


637


AB004884 


Homo sapiens 


PKU-alpha


3718 


99 


638 


AJ002303


Homo sapiens


synaptogyrin 1c


1020 


100 


639


AJ002304


Homo sapiens


synaptogyrin 1b


1002 


100 


640 


AJ002303


Homo sapiens


synaptogyrin 1c


933 




641 


D87682 


Homo sapiens 


similar to a C.elegans 
protein encoded in cosmid
T26A5.


26^7? 


100


642 


M14660 


Homo sapiens 


ISG-K54 


2473 


99 


643 


X06661 


Homo sapiens 


calbindin (AA 1-261) 


1358 


100 


644 


AF119900 


Homo sapiens 


PR02822 


185 


^6 


645 


AB031048


Drosophila 
melanogaster 


microtubule associated- 
protein orbit 


738 


27 


646 


AF250842


Drosophila

melanogaster


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein


10110 


99 


648


U67934 




44.9 KDa protein C18B11 
homolog


827 


96 


649 


AF236061 


Oryctolagus
cuniculus


RING- finger binding protein 


3330 


91 


650 


AL034551 


Homo sapiens


dJ914P20.2 (KIAA0784 protein
similar to Mus musculus
activity-dependent
neuroprotective protein
(Adnp))


5708 


100 


653 


X14766 




GABA-A receptor alpha 1
subunit


2388 


99 


654 


AC004614 


Homo sapiens 


similar to f-spondin proteins
AB006086 (PID:g2529225) 


3026 



99 


655 


Y5790S 


Homo sapiens 


Human transmembrane protein 
HTMPN-32. 


608 


99 


656 


Z34975 


Homo sapiens 


ldlCp 


3733 


100 


658 


AL050306


Homo sapiens 


dJ475B7.2 (novel protein) 


1942 


99 


659 


W76734 


Homo 
sapiens 


Human mDia Rho targeting 
protein. 


781 


34 


660 


AF202724 


Homo sapiens 


Sadl unc-84 domain protein 1 


2172 


100 


661 


Z21966 


Homo sapiens 


mPOU homeobox protein 


1529 


100 


662 


AJ242954 


Mus musculus 


dysferlin 


4752 


59 


663 


AF182316 


Homo sapiens


myoferlin


6232 


99 


665


ALl6lSl£ 


Arabidopsis 
thaliana 


hypothetical protein 


209 


30 


667 


X59303 


Homo sapiens 


valyl-tRNA synthetase 


3393 


99 


668 


Y133S5 


Homo sapiens 


Amino acid sequence of
protein PRO220.


3692


100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to endo- 

beta-N-acetylglucosaminidase 

gene 


611 


52


671 


X56123 


Mus musculus


talin 


4474 


76 


672 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP11


806 


42 


674 


AF229633 


Mus musculus 


groucho-related protein 4 


4053 


99 


675 


L14463 


Rattus 


' transducin 


3619 


92


SEQ
ID 
NO: 


ACCESSION 
NUMBER


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


IDENTITY 






norvegicus 








676 


AC005757 


Homo sapiens 


R32611_JL 


2779 


100 


677


S61069 


Homo sapiens 


reverse transcriptase 
homolog=pol {retroviral 
element } 


252 


65 


678 


AF271388 


Homo sapiens 


CMP-N-acetylneuraminic acid 
synthase 


2273 


100 


679 


X79066 


Homo sapiens 


ERF-1 


1783 


100 


680 


AF118566 


Mus musculus


hematopoietic zinc finger 
protein 


769 


50 


681 


Y51415 


Homo 
sapiens 


Human wild type pKe83 
protein . 


2621 


99 


682 


AL133545 


Homo sapiens 


bA386N14.1 (novel protein 
similar to a dual specificity 
phosphatase) 


700 


68 


683 


Y86214 


Homo sapiens 


Nuclear transport protein 
clone h£b341 protein 
sequence. 


5888 


99 


684 


Y94952


Homo sapiens 


Human secreted protein clone 
fhll6_ll protein sequence 
SEQ ID NO: 110. 


354 


98 


685 


AL021878 


Homo sapiens 


dJ257I20.4 (transcription 
factor 20 (AR1) (KIAA0292) 
(isoform 2) ) 


154 


67 


686 


AE000198


Escherichia 
coli 


orf, hypothetical protein


628 


100 


687 


M58378 


Homo sapiens 


synapsin I 


3730 


99 


688 


AF039S97 


Homo sapiens 


antigen NY-CO-31


508 


98 


689 


U09355 


Oryctolagus 
cuniculus


protein phosphatase 2A1 B
gamma subunit


2356 


99 


690 


AF155106 


Homo sapiens 


NY-REN-36 antigen


265 


50 


691 


AC004774 


Homo sapiens 


Dlx-5


1542 


100 


692 


X90530 


Homo sapiens 


ragB 


1926 


99 


693 


X90530 


Homo sapiens 


ragB 


1405 


99 


694 


X90530 


Homo sapiens 


ragB 


1590 


85 


695 


G01563 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5644.


330 


100 


696 


AC011810 


Arabidopsis 
thaliana


Putative methionine 
aminopeptidase 


669 


52 


697 


AJ250425 


Rattus
norvegicus 


Collybistin I 


2455 


98 


698 


AB037901 


Homo 
sapiens 


gene amplified in squamous 
cell carcinoma-1


5364 


99 


699 


Y99401 


Homo sapiens 


Human PRO1327 (UNQ687) amino
acid sequence SEQ ID NO: 218. 


1386 


100 


701 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


6705 


100 


702 


X83573


Homo sapiens 


ARSE 


3184 


99 


703 


AJ243274 


Homo sapiens 


AP-2rep protein 


2078 


99 


704 


Y71262 


Homo sapiens 


Human chondromodulin-like
protein, Zchml.


1697 


94 


705 


Y71262 


Homo sapiens 


Human chondromodulin-like
protein, Zchml. 


1736 


99 


706 


Y41257 


Homo sapiens 


Amino acid sequence of long 
human FAIM. 


1060 


100 


707 


AL022237 


Homo sapiens 


bK119lB2.3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 
1) ) 


2030 


100 


708


AJ006266 


Homo sapiens 


AND-1 protein 


5942 


100 


709 


G01571


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652. 


777 


99 


710 


Y08698 


Homo sapiens 


ranbp3 


2849 


98 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-2. 


754 


99


SEQ
ID 
NO: 


ACCESSION 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


IDENTITY 


712 


U93574 


Homo sapiens


putative p150


799 


59 


713 


AC004531


Homo sapiens


Gene with similarity to DEAD
box helicases 


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma


538 


48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2. 


734 


98 


716 


AL137013 


Homo sapiens 


bA311P8.3 (probable uracil 
phosphoribosyltransferase)


862 


100 


717 


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase 


1696 


93 


718 


Y96290 


Homo
sapiens


Human IGFAM-2 immunoglobulin. 


2345 


85 


719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347


99 


720 


AJ224819


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


W41565 


Homo
sapiens


Human calpain. 


1591 


99


723


AF161341 


Homo sapiens 


HSPC078 


1097 


98 


724 


AF187318 


Homo sapiens 


F-box protein Fbx2


1507


100 


725 


AC006708


Caenorhabdit
is elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


1143 


46 


726


AC006708


Caenorhabdit
is elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


988


46 


727


AC024818 


Caenorhabdit 
is elegans 


contains similarity to Pfam 
family PF00400 (WD domain, 
G-beta repeat), score=81.8,
E=1.4e-20, N=3


950 


44 


728


AJ005897


Homo sapiens 


JM5 


831 


47 


729 


Y45377


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27.


908 


97 


730


G03931 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8012.


579 


100 


731 


AB012720 


Oncorhynchus
masou


GTP-binding protein


3865 


76 


732 


W73404 


Homo sapiens


Human secreted protein
encoded by Gene No. 8.


862 


97 


733


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644 


— — 


734 


AC024813


Caenorhabdit
is elegans


Hypothetical protein
Y54F10AL.a


152 


24 


735


AL035461


Homo sapiens


dJ967N21.6 (novel CDP-alcohol
phosphatidyltransferase
family member protein) 


1562 


98 


736 


U00033


Caenorhabdit 
is elegans 


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF07903B 


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99


TABLE 2



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY


738 


AJ131712 


Homo sapiens 


nucleolar RNA-helicase 


2793 


100 


739


AJ133115 


Homo sapiens 


TSC-22-like protein 


2054 


99 


740 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


953


100 


741 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


564 


74 


742 


U97191 


Caenorhabdit
is elegans 


strong similarity to the YPT1 
sub-family of RAS proteins


960 


85 


743 


X76057 


Homo sapiens 


phosphomannose isomerase


2191 


100 


744 


G03209 


Homo sapiens 


Human secreted protein, SEQ 

ID NO: 7290. 


496 


98 


745 


X97064 


Homo sapiens 


Sec23 protein 


4034 


99 


746 


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 


994 


100 


747 


Y73388 


Homo sapiens 


HTRM clone 3376404 protein 
sequence . 


1565 


99 


748


M19529


Sus scrofa


follistatin A


1906 




749 


AJ249457


Trichomonas
vaginalis 


centrin, putative 


183 


£,<} 


750 


AC004410 


Homo sapiens


foa^9554 1 




J.UU 


751 


AF074968 


Homo sapiens 


P47ING3 protein 


2167 


100 


752 


AF3 (»?9fl A 


Homo sapiens 


transcription specificity 
tactor op J. 


4005 


100 


753 


AB049629 


Homo sapiens 


phospholysine phosphohistidine inorganic
pyrophosphate phosphatase


1375 


99 


754


D79205 


Homo sapiens


ribosomal protein L39


160 


77 


755 


AB008430 


Homo sapiens 


CDEP 


142 


29 


758 


L32162


Homo sapiens 


transcription factor 


574 


80 


759 




Homo sapiens 


RING zinc finger protein 


295 


54 


760 


Y44250 


Homo 
sapiens 


Human cell signalling 
protein-13.


625 


100 


761 


AF218586 


Homo sapiens 


Cide-b 


1136 


100 


762 


U38934 


Gallus gallus


histone H2A


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-B 


606 


32 


764 


X13403 


W nmft ess**"* "i an e 


yct-i ptocean iaa x - 


36^26 


100 


765 


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid
C27F2 (U40419)


568 


38 


766 


AL023828 


Caenorhabditis
elegans




■5 /in 


_ 

27 


767 


Y82777 


Homo sapiens 


Human chordin related protein
(Clone dw665 4).


255^1 


99 


768 


X92475 


Homo sapiens 


ITBA1 


1429 


100 


769 


Y42752 


Homo sapiens 


Human calcium binding protein 
3 (CaBP-3). 


1426 


100 


770 


X51416 


Homo sapiens 


hormone receptor hERRl (AA 1- 
521) 


1641


97 


771 


AJ006591 


Homo sapiens 


cysteine-rich protein 


1793 


100 


772 


A08695 


Homo sapiens 


rap2 


935 


100 


773 


Z12173 


Homo sapiens 


N-acetylglucosamine-6-
sulphatase 


2970 


100 


774 


Y919S0 


Homo sapiens 


Human cytoskeleton associated 
protein 5 (CYSKP-5).


565 


43 


776 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


855 


56 


777 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)




C<j 

DO 


778 


G01880


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5961.


849 


98 


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


AL078S82 


Homo sapiens 


dJ130E4.2 (KIAA0796)


1321 


68 


781 


Z7S955 


Caenorhabditis
elegans


similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJ1121G12.2 (SCAN domain-
containing 1 protein) 


900 


100 


783 


AF061262 


Mus 

musculus 


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784 


G03873 


Homo sapiens 


Human secreted protein, SEQ 


649 


95 






WO 01/53312 



TABLE 2 
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say 
ID 
NO: 




SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 














785 


Y84441 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2074 


100 


786 




Homo sapiens 


Human Rab protein, RABP-1, 
protein sequence.


1048 


99 


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit 


1548 


99 


788


■fin niCloyi 


Homo sapiens 


SRp25 nuclear protein 


962 


S4 


789 


AF024631 


Homo sapiens 


ANG2


2644 


100 


790 


AJ005710


Rattus
norvegicus 


phosphatidylinositol 3-kinase


4508 


97 


792 


V00638 


bacteriophage
lambda


reading frame ea10


600 


100 


793 


AF049103 


Homo sapiens 


Huntingtin interacting 
protein 


819 


100 


795 


Z26317 


Homo sapiens 


desmoglein 2 


4810 


99 


796 


I V 6 o 84 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


5080 


99 


797


U15155 


Gallus 
gallus 


trypsinogen 


372 


37 


798 


U97189 


Caenorhabdit 
is elegans 


strong similarity to the
PI3/PI4 family of kinases


227 


28 


799 


AF112201 


Homo sapiens


neuronal protein NP25 


1053 


100 


800 


AF234765 


Rattus 
norvegicus 


serine-arginine-rich splicing
regulatory protein SRRP86 


958 


63 


801 


AF267852 


Homo sapiens 


placental protein 13-like
protein 


743 


99 


802 


AF208851 


Homo sapiens 


BM-009 


766 


80 


803 


Z81097 


Caenorhabdit 
is elegans 


Similarity to Human 
retinoblastoma-binding
protein RBAP46 yk662dl2.5 
comes from this gene 


152 


27 


804 


G02113 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6194.


496 


98 


805 


AL121S73 


Homo sapiens 


bA305P22.1 (novel protein) 


1160 


100


806 


AC013483


Arabidopsis 
thaliana 


putative GTPase activator
protein 


264 


30 


807 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30


808 


AB013B85 


Homo sapiens 


beta-ureidopropionase 


1494 


100


809 


AF078842 


Homo sapiens 


HOTTL protein


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303


2134 


96 


811 


AF2616B9 


Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


812 


Z74029 


Caenorhabdit 
is elegans 


Similarity to C. elegans 
alcohol dehydrogenase comes 
from this gene 


610 


71 


813 




Homo sapiens


CU240C2.2 <Core his tone 

TJO R /tlOta fill /13A \ 


324 


100 


814 


W87689 


Homo sapiens


Human HTXFT19 polypeptide. 


1484 


99 


815 


X16282 


Homo sapiens


zinc finger protein (217 AA) 
\x xs zna DuSc xn coaon) 


1109 


99 


816 


Z92539 


Mycobacterium
tuberculosis 


pth 


300 


36


818 


AB030483 


Mus musculus 


B9 


197 


27 


819 


AL1175S5 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS 


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700 


99 


822 


L34807 


Musca 
domestica 


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824 


Z99531


Schizosaccha


caffeine-induced death 


184 | 29 
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TABLE 2 



PCT/US00/34263 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


825 


AJ006692 


romyces 
pombe 

Homo sapiens 


protein 1 

ultra high sulfur keratin






826 


U23037 


Oryctolagus
cuniculus 


eIF-2Bepsilon


693 
3406 


68 
90 


827 


G03412 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7493. 


464 


100 


828 


Y30327 


Homo sapiens 


Human secreted protein 
encoded from gene 17. 


113 


44 


829 


Y32199 


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379. 


1012 


100 


830 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33.


1264 


99 


832 


AB011542 


Homo sapiens 


MEGF9 


2097 


100 


833 


G02639 


Homo sapiens 


Human secreted protein, SEQ

ID NO: 6720. 


223 


70 


834 


AF119664 


Homo sapiens 


transcriptional regulator
protein HCNGP 


1574 


100 


835 


AF119664 


Homo sapiens 


transcriptional regulator 


1144 


89 


836 
837 


AF119664 
X12517 


Homo sapiens 
Homo sapiens 


transcriptional regulator 
C protein (AA 1-159)


1448 


94 


838 


U32865 


melanogaster


linotte protein


918 
164 


100 
24 


839 
840 


AF067?30 
U27831 


Homo sapiens 
Homo sapiens 


TLS-associated protein TASR-2
striatum-enriched phosphatase 


631 
2840 


56 
98 


841 
842 


AF286366 
G02309 


Homo sapiens 
Homo sapiens


CamKI-like protein kinase 
Human secreted protein, SEQ
ID NO: 6390.


1796 
278 


100 

98 


843 


AE003615 


Drosophila 
melanogaster 


ade3 gene product 


113 


48 


844 


G01350 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5431. 


629 


100 


845 
847 


U27838 
Y87789 


Mus musculus 
Homo sapiens 


glycosyl-phosphatidyl-
inositol-anchored protein
homolog

Human RBP-26 protein. 


"330S 


9* 


848


AF164794


Homo sapiens 


Diff33 protein homolog 


2398 


100 
100 


849 
850 


U41315 
AF192784


Homo sapiens 
Homo sapiens 


ZNF127-Xp 
makorin 1


2458 


93 


851 
852 


Y58£28 
Z22968 


Homo sapiens 
Homo sapiens 


Protein regulating gene 
expression PRGE-21.
M130 antigen 


2062 
1548 


97 
100 


853 


Z22971 


Homo sapiens 


M130 antigen extracellular 
variant 


6380 


100 

166 


854 


G03362 


Homo sapiens 


ID NO: 7443. 


330 


96 


855 
856 


G03362 
AF285118 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ
ID NO: 7443.
CGI-203


203 

_ . 


100 


857 


AC006069


thaliana 


putative cleavage and
polyadenylation specificity


452 
1383 


100 
55 


858 


AL021546 


Homo sapiens 


Cytochrome C Oxidase 
Polypeptide VIa-liver
precursor (EC 1.9.3.1)


593 


100 


859 


L02956 


Xenopus 
laevis 


ribonucleoprotein 


1664 


85 


860 


AF201947 


Homo sapiens 


MEK binding partner 1


616 


100 


861 


L31783 


Mus musculus 


uridine kinase 


1266 


92 


862 


AF161472 


Homo sapiens


HSPC123 


602 


73 


863 


Z49068 


Caenorhabdit 
is elegans 


mitochondrial carrier protein 


370 


43 


"864 


AF254108 


Homo sapiens 


tumor necrosis factor type 1


3559 


99 
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TABLE 2 






SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


WATERMAN 
SCORE 


% 

IDENTITY








receptor associated protein 






865 


AE001530 


Helicobacter 
pylori J99 


putative 


230 




866 


XS7807 


Homo sapiens 


immunoglobulin lambda light 
chain 


699 


91 


867 


AL031673 


Homo sapiens 


dJ694B14.1 (PUTATIVE novel 
KRAB box protein with 18 C2H2
type Zinc finger domains) 


4066 


99 


868 


Y11652 


Homo sapiens 


phosphate cyclase 


238 


100 


869 


AF192968 


Homo sapiens 


high-glucose- regulated 
protein 8 


3041 


99 


870 


AB020648 


Homo sapiens 


KIAA0841 protein 


3237 


99 


871 


AL031427 


Homo sapiens 


dJ167A19.1 (novel protein)


1608 


100 


872 


AF151534 


Homo sapiens 


core histone macroH2A2.2 


1866 


100 


873 


AL021331 


Homo sapiens 


dJ366N23.1 (putative C.
elegans UNC-93 (protein 1,
C46F11.1) LIKE protein)


1129 


100 


874


X14608 


Homo sapiens 


propionyl-CoA carboxylase 


•33 / 3 


100 


875 


AL117334


Homo sapiens 


dJ687F11.1 (novel protein
(part of translation of cDNA
DKFZp434N061, Em:AL110249))


306 


100 


876 


X79489 


Saccharomyces
cerevisiae


E-925 protein 


446 


35 


877 


Y53001 


Homo sapiens 


Human secreted protein clone 
dn834 l protein sequence SEQ 
ID NO: 8. 


ft! T 
Oil 


100 


878 


AF231064


Homo sapiens 


CHMP1.5


957 


100 


879 


X79417 


Sus scrofa


40S ribosomal protein S12 


687 


100 


880 


AF001317 


Saccharomyce 
s cerevisiae 


Soilp 


478 


"28 


881


Y87275 


Homo sapiens 


Human signal peptide 
containing protein HSPP-52 
SEQ ID NO: 52. 


2547 


100 


882 


M14036 


Homo sapiens 


Cl-inhibitor 


598 


77 


883 


AB041261 


Homo sapiens 


calcium-independent
phospholipase A2


2903 


100 


884 


AF020313 


Mus musculus


proline-rich protein 48


999 


84 


885 


Y10936 


Homo sapiens 


Hypothetical protein 


1104 


99 


886


AF073997 


Mus musculus


uiyyv-uijuicn j.n reiaceo protein 
1 


B66 


36 


887 


Y57893 


Homo sapiens 


Human transmembrane protein 
HTMPN-17.


1099 


94 


888 


AL117635 


Homo sapiens 


hypothetical protein 


929 


99 


889 


AF210317 


Homo sapiens 


f ft rill (*J)h^ Ira y-rl ' - " — ■" — 

GLUT 9 


2046 


99 


890 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


583 


100 


891 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


192 


57 


892 


AF237S31 


Homo sapiens 


ubiquitous tropomodulin U- 
Tmod 


1798 


100


893 


AF090929 


Homo sapiens 


PR00477p 


653 


99 


894 
895


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein
BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


3196 


100




AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S.
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


2825 


96 


896


AF171102


Homo sapiens 


retinal degeneration B beta 


13 02 


§5 


897 


AE003551 


Drosophila 
melanogaster 


CG18176 gene product 


€33 


33 
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TABLE 2 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 




DESCRIPTION


SMITH-
WATERMAN
SCORE


IDENTITY 


898


AJ237946 


Homo sapiens 


DEAD Box Protein S 


2443 


100 


899 


Z97184




EKE2 




100 


900 


Z97184 


Homo sapiens 


KKE2 


409 


98 


901 


AJ245587 


Homo sapiens 


i'c a. ty^Jc ^iJit. tinker 




100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A


1011 


100 


903 


R95953 




Tfiilra Y\/n t" "i r* r>*»1 1 it YVM.rf- Vi 
CiU^a^yutJKr CCXJL y iuWLU 

inhibiting factor. 


4 14 


96 


904 


L04733 


Homo sapiens 


kinesin light chain


1 QIC 


72 


905 


AE003540 


Drosophila
melanogaster


CG10984 gene product 


446 


33 


906


M55542 


Homo sapiens 


guanylate binding protein 


2993 


98 


907 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I


2901 


96 


908 


W8408S 


Homo sapiens 


Human membrane fusion protein 
WDProl. 


1889 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain-
interacting protein 


647 


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein 
HFB101L 


2196 


100 


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952. 


521 


100 


912 


G03162 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 7243. 


387 


87 


913 


AJ243721 


Homo 
sapiens] 
>Y92508 
Y92508 13- 
APR-2000 06- 
OCT-1998 
Human OXRE- 
5 . [Homo 
sapiens 


dTDP-4-keto-6-deoxy-D-glucose 
4-reductase 


1710 


100 


914 


U24189 


Caenorhabdit 
is elegans 


hypothetical protein 1207-1; 
Method: conceptual
translation supplied by - 
authors 


244 


41 


915 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


843 


99 


915 




Archaeoglobu 
s fulgidus 


dinitrogenase reductase 
activating glycohydrolase


171 


26 


918


M23159 


Cricetus 
cricetus


DHFR-coamplified protein


163 


30 


919 


L12018 


Caenorhabdit 
is elegans


putative 


1232 


41 


920 


AF102177 




tumor antigen SLP-8p


1260 


97 


921 


AL096712 


Homo sapiens 


dJ744I24.2 (similar to a 
novel human gene mapping to 
Activator) 


1017 


78 


922 


A.Til fi 1 


Arabidopsis
thaliana


putative WD-repeat protein 


866 


42 


923 




Arabidopsis
thaliana


putative WD-repeat protein 


442 


36 


924 


w3 / UUl 


Caenorhabditis
elegans


similar to 


605 


51 


925 


X71978 


Mus musculus 


Fit 


1503 


95 


926 


K92288 


Drosophila 
melanogaster 


beta-spectrin


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9.


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703_1.


2249 


100 


930


AJ224326 


Homo sapiens 


ribulose-5-phosphate-
epimerase 


912 


100 


931 


U28991 


Caenorhabdit


coded for by C. elegans cDNA 


666 


55 
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TABLE 2 






SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES


DESCRIPTION 


SMITH-
WATERMAN
SCORE


IDENTITY 






is elegans 


cm21c7






932 


AL030065 


Homo sapiens 


hypothetical protein 


210 


25 


933


G01384 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5965.


767 


98 


934


AJ276485 


Homo sapiens 


integral membrane transporter 


1200 


100


935


AL035681 


Homo sapiens 


dJ756G23.3 (novel protein 
transcriptional repressor) 


1142 


80 


936


AB026808 


Mus musculus 


synaptotagmin XI 


2142 


95 


937 


AB015345 


Homo sapiens 


HRIHFB2216 


2601 


99 


938


AOS 1 £.**. 


Homo sapiens


URF2 


498 


100 


939 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156.


1487 


100 


940 


G04047 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 8128. 


117 


100 


941 




Homo sapiens 


putative HIV-1 infection


452 


100 


942 


AC024200 


is elegans


contains similarity to 
several zinc finger proteins
but not to the zinc finger

domains 


350 


69 


943 


AF129756 


Homo sapiens 


G5c 


273 


100 


944 


K23765 


Rattus
norvegicus


alpha-tropomyosin


133 


96 


945 


AC009917 


Arabidopsis 
thaliana


Contains similarity to 


583 


47 


946


AF223468 


Homo sapiens 


AD021 protein 


551 


44 


947 


AF055473 


Homo sapiens 


GAGE-8


273 


51 


948 


X75756 


Homo sapiens 


protein kinase C mu 


2019 


68 


949 


API 41 ace 


Mus musculus 


coronin-2


2300 


93 


950 


Y36729 


Homo 
sapiens 


Human pgi protein sequence. 


1861 


99 


951 




Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2.


282 


67 


952


AB016881 


Arabidopsis 
thaliana


gene_id:MXC17 .7- 


203 


46 


953


Y01785 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL-
1999 12-AUG-1998 Human NCE-2
protein.


365 


100 


954 


AF145615 


Drosophila

melanogaster 




823 


46


955 


U09410 




zinc finger protein ZNF131 


2483 


99 


956 


U09410


Homo sapiens


zinc finger protein ZNF131


1853 


99 


957 


AF195623 


Homo sapiens 


cholinephosphotransferase 1
alpha 


2126 


99 


958 


X94917 


Drosophila
melanogaster 


ucaU'clcvdLeu S A.JU t CoSlOIl ill 

0.9 fcb 


155 


32 


959 


U54807 


Rattus 
norvegicus


GTP-binding protein 


1167 


j / 


960 


AF05B807 


Bos taurus 


GTP-binding protein rsh


606 


(17 


961 


G03244 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7325.


471 


100 


962 


AF078850 


Homo sapiens 


steroid dehydrogenase homolog 


583 


40 


963 


AP0017S4 


Homo sapiens 


transient receptor potential-
related channel 7, a novel 
putative Ca2+ channel protein 


317 


30 


964 


AL035419


Homo sapiens 


dJ1100H13.1 (putative novel
protein) 


1129 


100 


965 


X61381 


Rattus 
rattus 


interferon-induced protein


202 


46 


966 


D38169 


Homo 
sapiens 


inositol 1,4,5-trisphosphate
3-kinase isoenzyme 


3278 


100 


967 


AL031432 


Homo 
sapiens 


dJ4fi5N24.2.1 (PUTATIVE novel
protein) (isoform 1)


893 


100 
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TABLE 2 






SEQ 
ID
NO: 


ACCESSION 
NUMBER 




DESCRIPTION 


SMITH- 
WATERMAN 


IDENTITY 


968 


U79275 


Homo sapiens 


unknown 


""611 


100 


969


AJ011306 


sapiens 


guanine nucleotide exchange

factor (long isoform) 


— ™ 


99 


970 


AF281134 


Homo sapiens 


exosome component Rrp46


1186 


100 


971


U53336 


Caenorhabdit 
is elegans 


weak similarity over a short
region to myosin heavy chain 


536 


23 


972 


AC018749 


Leishmania 
major 


L8840.12 


589 


S3 


973 


AF188504 


Mus musculus 


hNV 


544 


85 


974 


U25801 


Homo sapiens 


Tax1 binding protein


852 


98 


975 


AF049523 


Homo sapiens 


huntingtin-interacting
protein HYPA/FBP11 


1 "a on 


y / 


976 


AF161530 


Homo sapiens 


HSPC182 


1040 


100 


977 


G04020 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101.


626 


100 


978 


AF164797 


Homo sapiens 


ribosomal protein L17 isolog 


908 


100 


979 


U94991 


laevis 


transcription factor XLM01 


795 


97 


980 


S73775 


Homo sapiens


calmitine; calsequestrine


2029 


100 


981 


Y94888 


Homo 
sapiens 


Human protein clone HP01462.


2501 


100 


982 


AJ243Z91 


Homo sapiens 


heat shock protein 


827 


96 


983


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase
complex 


964 


85 


984


A»J24S207 


Rhodococcus
sp. AD45 


putative racemase 


351 


43 


985


6.3UU93 


Homo sapiens 


basic transcription factor 2, 
35 kD subunit 


1576 


99 


986 




Homo sapiens 


contains two glutamine rich 
domains, three zinc-finger
domains, and matrin 3 
homologous domain 3 (MH3) 


4697 


99 


987 


AF227258 




RPGR- interacting protein-1 


1262 


38 


988 


AL022238


Homo sapiens 


dJ1042K10.2 (supported by 

GENSCAN, FGENES and GENEWISE)


4048 


99 


989


AL022238 




dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


2321 


99 


990


AF161426 


Homo sapiens 


HSPC308 


448


92 


991


AF161426 


Homo sapiens 


HSPC308


448 


92 


992 


AF161426


Homo sapiens 


HSPC308 


453 


92 


993 


AL023859 


Schizosaccha 

romyces

pombe 


•"•■"a tj^it X Ally cilUUilUv 1 B3S6 

subunit 


172 


42 


994 


AL049631 


Homo sapiens 


dJ513M9.1 (novel Homeobox
domain protein) 


241 


47 


995 


AC005253 


Homo sapiens 


R26445_1


902 


100 


996 


AF265206 


Homo sapiens 


ri\JU± ISOIDITll A 


974 


100 


997 


AJ2482B5 


Pyrococcus 
abyssi 


sarcosine oxidase, subunit 


195 


28 


998 


AE003641


Drosophila 
melanogaster


nva . u£>uu34 j. . j gene produce 


218 


58 


999 


W69343 




Secreted protein of clone 


1340 


93 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP 
translocase Tl mRNA with 
GenBank Accession Number
M24102.1


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence.


1658 


100


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100


1003 


AE004944 


Pseudomonas 
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens 


dJ462023.2 (novel protein) 


2058 


100


1005 


S45367 


Canis
familiaris 


centractin 


1949 


100 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 




SMITH-
WATERMAN 
SCORE 


%

IDENTITY 


1006 


S45347 


Canis
familiaris


centractin 


1315 


98 


1007 


AB022158 


Mus musculus


chaperonin containing TCP-1
epsilon subunit


2649


y o 


1008 


Y76332 




protein encoded by gene 38. 


1282


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger
protein 


1671 


58 


1010 


Z63218 


is elegans 




269 


67 


1011 


AB011414 




Kruppel-type zinc finger
protein 


1671 


58 


1012 


Z14000 




RING1


2017 


100 


1013 


G02841 


Homo sapiens 


Human secreted protein, SEQ 


332 


93 


1014 


AF145659 


Drosophila 
melanogaster 


BcDNA.GH10333


1244 


52 


1015 


Y02860 


Homo sapiens


Fragment of human secreted
protein encoded by gene 65. 


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor
complex p23-like protein. 


772 


97 


1017 


Y9944S 


Homo sapiens


Human DPOl *7c:q /rTwno "> o \ — =>tt» -I 

nuuid-n eK\jx /:>i> \vj.wyo.J^j amino 

arid RpffllPnrA Qprt rr\ MO.^TA 


2323 


100 


1018 


X67250 


Rattus 
norvegicus 


n-chimaerin 


1710 


97 


1019 


AF183417


Homo 
sapiens 


microtubule-associated


631 


100 


1020 


AF1G4795 




sex-regulated protein janus-a


674 


100 


1021 


AF19062S 


coturnix 


qugx - J. 


638 


96 


1022 


AL133363 


thaliana 


putative protein 


155 


37 


1023 


AB034912 


Homo sapiens


WD-repeat like sequence


2483 


100 


1024 


AY007091 


Homo sapiens 


similar to Homo sapiens 
mammalian inositol 
hexakisphosphate kinase 2 
vxfQivzj mtuNA witn Ge 


2243 


100 


1025 


X69910 


Homo sapiens 


P63 protein 


2958 


99 


1026 


%/ M V t J u 


Homo sapiens 


CAGF9 


1657 


100 


1027 


AB029333 


roretzi 


jirJrJQ l-l 


1048 


54 


1028


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme
isolog 


1045 


100 


1029 


G01797 


Homo sapiens 


Human secreted protein, SEQ 


749 


98 


1030 


G01797 


Homo sapiens


Human secreted protein, SEQ

ID NO: 5878. 


749 


98 


1031 


AF193795


Homo sapiens


vacuolar sorting protein 
VPS29/PEP11 


960 




1032 


AJ222968 


Mus musculus


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccha 

romyces 

pombe 


DNA2-NAM7 helicase family
protein 




31 


1034 


Y41519 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 75. 


1321 


99 


1035 


AJ276004


Mus musculus 


Paxneb protein 


1709 


77 


1036 


AF025459 


Caenorhabdit 
is elegans 


H14A12.3 gene product 


190 


30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger
protein; this is a splicing 
supplied by author 


196 


43 


1038


W74580 


Homo 
sapiens 


Human membrane protein 
BA0306. 


1921 


97 


1039 


U88173 


Caenorhabdit 
is elegans 


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein 8


331 


80 
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TABLE 2 
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SEQ 
ID 
NO: 


ACCESSION
NUMBER 


SPECIES 


DESCRIPTION


SMITH- 
WATERMAN 


% 

IDENTITY 


1040


AF290204 


Homo sapiens 


blood group carrier molecule 
DOK1 


1637 


go 


1041 


Y96730 


Homo 
sapiens 


PRO539, a Costal-2 homologue.


162 


22 


1042 


AF1406B3 


Mus musculus 


F-box protein FWD2 


2397 


98 


1043 


AF151023 


Homo sapiens 


HSPC189 


1104 


100 


1044 


AF181631 


Drosophila 
melanogaster 


BcDNA.GH04929


204 


37


1045 


Y77985 


Homo sapiens 


Human collectin amino acid 
sequence.


1940 


100 


1046 


AJ243972 


Homo sapiens 


6-phosphogluconolactonase


1317 


100 


1047 


AB035863 


Homo sapiens 


ATP specific succinyl CoA 
synthetase beta subunit 
precursor 


2324 


99 


1048 


AL034550 




dJ1184F4.2 (novel protein
similar to nucleolar protein
4 (NOL4) (NOLP))


am 


. 


1049 


AF163825


Homo sapiens 


pre-B lymphocyte protein 3 


634 


100 


1050


AF201949 




60S ribosomal protein L30

isolog 


868 


100 


1051 


AF190624 


Mus musculus 


mdgl - i 


236


85 


1052 


AFOm^Q 

W"'UUJ 3 j& J 


Drosophila 
melanogaster


CG6151 gene product 


160 


44 


1053 


G01191


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272.


646 


98 


1054 


AXjXDA /do 


Neisseria 
meningitidis 


Glu-tRNA(Gln)
amidotransferase subunit A


682 


44 


1055 


AF181856 


Rattus 
norvegicus


tRNA selenocysteine
associated protein 


1525 


99 


1056 


U V J O ? 7 


Chlamydomonas
reinhardtii


Mr19,000 outer arm dynein
light chain 


244 


34 


1057


AF159141 




breast cancer metastasis-
suppressor 1 


663 


53 


1058 


AF230929


sapiens 


keratinocyte annexin-like


1710 


99 


1059 


AJ270952 


Homo sapiens 


putative membrane protein 


1363 


100 


1060


AF224263 


Heterodontus 
francisci


HoxD8


742 


83 


1061 


X63417 


Homo sapiens 


IRLB 


1037 


100 


1062 


AL079345 


Streptomyces
coelicolor 
A3 (2) 




143 


27 


1063 


Y71112 


Homo sapiens 


Human Hydrolase protein-10
(HYDRL-10).


2547 


100 


1064 


AF263614


Homo sapiens 


acetyl-CoA synthetase 


3493 


99 


1065 


Y13356


Homo sapiens


Amino acid sequence of

protein PR0221. 


1363 


100 


1066 


AC006153 




Q i_Tnil33T tO Ami i "~ ^"V a i rii i a 

GTP-bindilXCf orohein * eimil ^t* 
to AE000771 <PID:g29fi4292) 




98 


1067 


Y18930


Sulfolobus 
solfataricus 


hypothetical protein 


162 




1068 


R6S969 


Homo 

sapiens T98G 


polypeptide.


887 


100 


1069 


Y07964 


Homo sapiens 


Human secreted protein 
fragment 


863 


96 


1070 


AF177476 


Rattus 
norvegicus 


CDK5 activator-binding 
protein 


1995 


86 


1071 


AF245505 


Homo sapiens 


adiican 


3109 


99 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta 
subunit 


147 


36 


1073 


G03889


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970. 


698


98 


1074 


U15779 


Homo sapiens 


p70 


380 


28 


1075 


Y13392 


Homo sapiens


Amino acid sequence of 


1271 


91 



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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 


%

IDENTITY 








protein PRO328.






1076 


AF161457 


Homo sapiens 


HSPC339 


571 




1077 


Y79509 


Homo sapiens 


Human carbohydrate-associated
protein CRBAP-5.


2151 


98 


1078 


AF223466


Homo sapiens 


HT015 protein 


""831 


66 


1079 


AL132965 


Arabidopsis
thaliana


putative WD-40 repeat protein






1080 


AB024937 


Homo sapiens 


LUNX 


1284 


100 


1081 


Y14768 


Homo sapiens 


protein 


579 


100 


1082


AF016416 


Caenorhabdit 
is elegans


F29A7.4 gene product 


141 


31 


1083 


L13291 


Homo sapiens 


ADP-ribosylarginine hydrolase 


802 


45 


1084 


AB041541 


Mus musculus


unnamed protein product 


1S1 


44 


1085 


G01922 




Human secreted protein, SEQ
ID NO: 6003.


202 


97 


1086 


AB030814 


Homo sapiens 


c* *\ » v x v ' ^Ji.uLci.u xi^JinOJL 




833 


100 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer 

v^i*y-/-\ f- a ■{ n 


1142


100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a 

hitman DMA_Ti<r»e*/*\'**'^ ^ t 

protein. 


2783 


100 


1089 


Y94867 


Homo 
sapiens 


Human protein clone HP10563.


613 


100 


1090 


AK023982


Homo sapiens 


unnamed protein product 


130 


49 


1091 


AB041586 


Mus musculus 


unnamed protein product 


1103 


81 


1092 


Y71277 


Homo sapiens 


Human Zlipo3 protein. 


606 


100 


1093 


£734 973 


Mus musculus 


protein tyrosine phosphatase-
like 


1131 


95 


1094 


Y66677 




Membrane-bound protein


S22 


56 


1095 


Y87276 


Homo sapiens


Human signal peptide 
containing protein HSPP-53


1029 


99 


1096 


Y87276 


Homo sapiens


Human signal peptide
containing protein HSPP-53
SEQ ID NO: 53.


863 


98 


1097 


AF161455 


Homo sapiens 


HSPC337 


742 


98 


1098 


1*60029 


Caenorhabdit 

is elegans


similar to thioredoxin 


242 


39 


1099 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1321 


99 


1100 


AJ005866 


Homo sapiens 


Sqv-7-like protein


1118 


99 


1101 


AJ005866


Homo sapiens 


Sqv-7-like protein 


891 


99 


1102 


AJ005866


Homo sapiens


Sqv-7-like protein


1016 


99 


1103 


AL110244


Homo sapiens 


hypothetical protein 


299 


31 


1104 


**«. b^ti^t 


melanogaster 


brakeless-B 


147 


52 


1105 


AL031010 


Homo sapiens


rfiTi*?5T?9A 1 f DT1TATT\/T? nrara 1 ! 
Uvv«<f At • 1 (rUlAllvCi JlDVci. 

protein similar to C. elegans 
C02C2.5)


969 


100 


1106 


U28016 


Mus musculus 


parathion hydrolase 
(phosphotriesterase)-related
protein 


1624 


87 


1107 


AJ278150 


Homo sapiens 


putative lipid kinase 


2207 


99 


1108


G03733 


Homo sapiens 


ID NO: 7814.


a act 


98 


1109 


AF217287 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182076 


Homo 
sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418 


100 


1114 


G04039 


Homo sapiens 


Human secreted protein, SEQ 


475 


96 






WO 01/53312 



TABLE 2
SEQ
ID
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 








ID NO: 8120. 






1115 


AF229439 


Mus musculus 


zinc finger protein 289


1697 


91 


1116 


L40357 


Homo sapiens 


thyroid receptor interactor 


509 


100 


1117 


L40357


Homo sapiens


thyroid receptor interactor


404 


85 


1118 


A12155 




Human X5L cDMA ' 


1673 


100 


1119 


AL161542 


Arabidopsis 
thaliana


isomerase like protein


607 


53 


1120 


AL023754 


Homo sapiens 


dJ272L16.1 (Rat
Ca2+/Calmodulin dependent 
Protein Kinase LIKE protein) 


2341 


98 


1121 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25.


321 


36 


1122 


Z14122 


Xenopus 
laevis 


XLCL2 


455 


77 


1123 


AF225418 




lipase


1531 


97 


1124 


Y06518 


Homo sapiens


7,pr\ RTPaRP l' r» 1- afar* {* 4 nn 

protein ZIP . 


3227 


100 


1125 


AL035690 




auzuzizi.i inovei protein) 


952 


100 


1126 


AJ000217


Homo sapiens 


CLIC2 


1286 




1127 


AB0305G5 


Mus musculus 


UBE-1C2 


1069 


79 


1128 




Homo sapiens 


HTRM clone 1427838 protein 
sequence . 


874 


100 


1129 


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl
prolyl cis/trans isomerase
amino acid sequence.


877 


100 


1130 


AL023553


Homo sapiens 


dJ347H13.4 (novel protein)


557 


100 


1131


Y91945 


Homo sapiens 


Human chaperone protein 6
(HCHP-6).


1408 


100 


1132 


Z68197 


Schizosaccha
romyces
pombe


putative nuclear pore protein 


596 


39 


1133 


aO Oil? / 


Schizosaccha
romyces 


putative nuclear pore protein 


389 


35 


1134 


AF180681 


Homo sapiens 


guanine nucleotide exchange 
factor 


3597 


100 


1135 


AF079765 


Mus musculus 


enhancer of polycomb 


264 


41 


1136 


M62419 


Mus musculus


clathrin-associated protein 


2189 


99 


1137 


AJ006219 


Drosophila 


clathrin-associated protein 


1254 


78 


1138 


Y76218 


Homo sapiens 


Human secreted protein 
encoded by gene 33.


440 


98 


1139 


WB8104 


Homo 
sapiens 


A Rab protein designated 

TJRAT5 Q - 9 


1065 


99 


1140 


Y13401 


Homo sapiens 


Amino acid sequence of 


3979 


98 


1141 


W85026 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
*ap/u iusion proaucn . 


3309 


100 


1142 


Y13402 


Homo sapiens 


Amino acid sequence of 

p L U L C 1 il JtaUJXU • 


1694 


99 


1143 


G03875 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956.


660 


99 


1144 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


750 


98 


1145 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide.


1096 


100 


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG
(PROTEIN DXF34))


1233 


100 


1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG
(PROTEIN DXF34))


1233 


100 


1148


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370


93 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence . 


1492 


100 


1150 


W74841 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 


228 


55










HEAAR60. 






1151 


AF044201 


Rattus 
norvegicus 


neural membrane protein 35;
NMP35 


1570 


92 


1152 


AF156774 


Homo 
sapiens 


lysophosphatidic acid 
acyltransferase-gamma1


1855 


99 


1153 


AL118501


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


872 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein
sequence.


1381 


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117.


607 


99 


1157 


AF112444


Lupinus
luteus 


L-asparaginase


267 


43 


1158 


AF151848 


Homo sapiens 


CGI-90 protein


232 


32 


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona
savignyi 


PEM-6 


196 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107
SEQ ID NO: 107. 


746 


83 


1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1163 


AF113534


Homo sapiens 


HP1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Deddl 


191 


41 


1165 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 


AL118501


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


945 


75 


1167 


AF187733


Homo sapiens 


syntaphilin 


831 


42 


1168 


AB019435 


Homo sapiens 


phospholipase


951 


55 


1169 


AF064604 


Homo sapiens 


KE03 protein 


324 


33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene €. 


1191 


100 


1171 


L03188 


Saccharomyce 
s cerevisiae 


putative 


180 


22 


1172 


AF113751 


Mus musculus 


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein) 


1285 


100 


1175 


U41278 


Caenorhabdit 
is elegans 


F33G12.3 gene product 


332 


28 


1176 


M35617 


Homo sapiens 


T-cell receptor V-alpha-J- 
alpha region 


284 


83 


1177 


AC012680 


Arabidopsis 
thaliana


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL096767 


Homo sapiens 


dJ579N16.3 (novel protein 
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039716 


Caenorhabdit 
is elegans 


Similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240


Homo 
sapiens)
>R94974
R94974 09- 
MAY-1996 27- 
OCT-1994 
Human TCL-1 
polypeptide. 


T cell leukemia/lymphoma 1


617 


100






[Homo
sapiens 








1183 


U42B41 


Caenorhabdit 
is elegans 


short region of weak 
similarity to collagen 


161 


33 


1185 




Homo sapiens 


dicarboxylate carrier protein 


1470 


99 


1186 


L27645 


Danio rerio 


growth-associated protein 


130 


36 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 

HLHFP03 . 


636 


100 


1188 




Xenopus 
laevis 


ornithine decarboxylase-2


1459 


60 


1189


AL136307


Homo sapiens


dJ380B8.2 (Neuritin, a
protein which promotes 
neurite outgrowth) 


182 


33 


1190 


X89602 


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32828


Haemophilus
influenzae
Rd


ribosomal protein S6
modification protein (rimK)


268 


31 


1192 


AF154831 


Rattus 
norvegicus 


PV-1 


1403 


60 


1193 


Y50926


Homo sapiens 


Human fetal brain cDNA clone 
vcl6_l derived protein. 


918 


100 


1194 


AF026530 


Rattus 
norvegicus 


stathmin-like-protein splice 
variant RB3 • > 


1093 


97 


1195 


U35244


Rattus 
norvegicus 


vacuolar protein sorting 
homolog r-vps33a 


29B1 


96 


1196 


Y70470 


Homo sapiens 


Human p53 target molecule, 
PRG3 protein. 


1680 


100 


1197 


AF15731Q 


Homo sapiens 


AD-017 protein 


912 


47 


1198 


AF125443 


Caenorhabdit 
is elegans 


contains similarity to S. 
pombe phosphatidyl synthase 
(GB:Z28295) 


460 


39 


1199 


AF201934


Homo sapiens 


DC12 


1649 


88 


1200 


AL031775 


Homo sapiens 


dJ30M3.3 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin


484 


82 


1202 


Z85S86 


Homo sapiens 


dJ108K11.3 (similar to yeast 
suppressor protein SRP40) 


1143 


75 


1203 


U18762 


Rattus 
norvegicus


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus


jerky 


223ET 


76


1205 


AB002327 


Homo sapiens 


KIAA0329 


151 


24 


1206 


AB019233 


Arabidopsis 
thaliana 


ubiquinone/menaquinone
biosynthesis
methyltransferase-like


762 


56 


1207 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


742 


100 


1208 


AF2079B9 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin
G)))


181 


44 


1210 


U21549 


Mus musculus


"V.j j/ jjuy j. lit 


n *3 on 


6 8 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1267 


100 


1212 


AF117814 


Mus musculus


odd-skipped related 1 protein


945 




1213 


AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


D14849


Mus musculus 


meiosis-specific nuclear
structural protein 1 


19S0 


77 


1215 


G03022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103.


590 


100 


1216


Z7251Q 


Caenorhabdit 


similarity to yeast UTR3 


634 


49



PCT/US00/34263









is elegans 


protein (Swiss Prot accession 
yjvo / mil , a comes EiTOnl CtllS 

gene 






1217 


249703 


s cerevisiae 


unknown


134


22 


1218 


** o vj 


Arabidopsis
thaliana


f J r 9 . 1 o 


199 


29 


1219 


L10910 


Homo sapiens


splicing factor 


1026 


71 


1220 


270750 


Caenorhabdit 
is elegans


similar to vanadate 
resistance protein 
transmembranous comes from 
this gene 


965 


SB 


1221 


AL163815


Arabidopsis
thaliana


putative protein 


653 


61 


1222 


AF155100 


Homo sapiens 


zinc finger protein KY-REN-21 
antigen 


2261 


100 


1223 


iI05Q7i 




GTP-binding regulatory

^iuueiu gamma - o G uOUXl 1 L 


356 


100 


1224 


Y73364 


Homo sapiens 


** j> ivi'i cionc &/DOJ72r± proccin 
sequence • 


1169 


99 


1225 


AL050170 


Homo sapiens 


hypothetical protein 


714 


100 


1226


X64002 




RAP74 


2661 


99 


1227 


X04085 


Homo sapiens 


catalase 


284G 


100


1228 


AJO05S20 


Mus musculus 


skeletal muscle-specific gene 


1416 


90 


1229


AF045564 


Rattus 


development-related protein


1715 


93 


1230 


X97571 


Mus musculus 


HCMV-interacting protein


479 


96 


1231 


1.0B239 


Homo sapiens 


located at 0ATL1 


2274 


100 




At IZlODJ 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


AC IZXOOJ 


Homo sapiens 


sorting nexin 14 


1203 


84 






Caenorhabdit
is elegans 


contains similarity to 
TR:O04595 


744 


31


1235


AC006634 


Caenorhabdit 
is elegans 


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
YLR418c (GB:U20162)


357 


33 


1236 


Y18101 


Mus musculus 


macrophage actin-associated- 
tyrosine-phosphorylated
protein 


1559 


87 


1237 




Homo sapiens 


TGIF2 


1224 


100 


1238 


AB026264 


Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264


Homo sapiens 


IMPACT 


1123 


100 


1240 




Homo sapiens 


Human secreted protein, SEQ
xu wu : da j.u . 


324 


100 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21.


1363 


53 


1242 


AL035602 


Arabidopsis 
thaliana 


putative protein 


499 


28 


1243 


X76483 


Gallus 
gallus 


Yes-associated protein 


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 

px.ui.ciii niui^ 


503 


100 


1245 


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein) 


856 


100 


1246 


AJ27^003 


Ji^^tllLJ Ct CL M ju t^XXo 


w«i protein 


1216 


100 


1247 


Y57910


Homo sapiens 


Human transmembrane protein 
HTMPN-34.


1369 


98 


1248


AC004874 


Homo sapiens 


similar to N-
acetylgalactosaminyltransfera
se; similar to Q07537
(PID:g1171989)


9S7 


100 


1249 


AF199S97 


Homo 
sapiens 


A-type potassium channel
modulatory protein 1


1139 


100 


1250 


Y13148 


Rattus 
norvegicus 


PAC360B 


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-
19


124 


46


1252 


AF146738


Rattus 
norvegicus


testis specific protein 


771 


83 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 68C6. 


419 


97 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating
enzyme polypeptide. 


1045


Q Q ■ .. - 


1255 


AC006538 


Homo sapiens 


BC41195 1 


831 


70 



1256 


AB004316 


Bos taurus 


mitochondrial methionyl-tRNA
transformylase 


1556 


88 


1257 


235094 


Homo sapiens 


SURF-2


J. j 34 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of
protein


2383 


100 


1259 


AC006014 


Homo sapiens 


cimi lay t~o UTTO hranaf ny^ninrr 

Drofcein.- simi 1 tr> P1j1T71 
(PID:gl32517) 


_ 

1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 


469 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is 
1st base in codon) (561 is 
3rd base in codon) 


984 


100 


1262 


X15443 


Rattus sp. 

— — , . i 


gamma-glutamyl transpeptidase 

1* boo; 


697 


32 


1263


API R7T 


Mus musculus


neuronal PAS 3 


977 


94


1264 


AP178983 


Homo sapiens 


Ras-associated protein Rap1


433 


97 


1265




Homo sapiens 



Human cyclic nucleotide-
associated protein-1 (CNAP-
1).


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PRO541 protein
sequence. 


1622 


100 


1267 


rl* VOX JtO 


Mus musculus


Edpl protein 


1077 


64 


1268 


U37006 


Caenorhabdit
is elegans


gene product


154 


23 


1269 


AP233582 


Mus musculus


ulraSe JtaOJ / 


942 


95 


1270 


AF195951 


Homo sapiens 


signal recognition particle 
68 


3127 


98 


1271 


AL031177 


Homo sapiens 


uuuojni3. j \ wilt/ vc x piOLeiii/ 


1150 


55 


1272 


AF201933


Homo sapiens 


DC11 


650 


100 


1273 


AF201933 


Homo sapiens 


DC11


346 


98 


1274 


AL021710


Arabidopsis
thaliana


putative protein


1 /I Q 


49 


1275


AC004449 


Homo sapiens 


R33683 3 




100 


1276 


Y86295 


Homo sapiens 


Human secreted protein 

HL2AG87, SEQ ID NO: 210. 


1920 


100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9).


1576 




1278 


S94421 


Homo sapiens 


T cell receptor eta-exon 


47R 




1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1280 


AF161380 


Homo sapiens 


HSPC262 


772 




1281 


Y48610 


Homo sapiens 


Human breast tumour-
associated protein 71.


779 


100 


1282 


AGO 1544 6 


Arabidopsis 
thaliana 


Similar to AIG1 protein 


406 


35 


1283 


AK024432 


Homo sapiens 


FLJ00022 protein 


403 


35 


1284 


W96153 


Homo sapiens


Human FADD-interacting
protein (FIP).


1825 


81 


1285 


AlJ001O19 


Homo sapiens 


ring finger protein 


1301 


100 


1286 


AE003823 


Drosophila 
melanogaster 


CG13178 gene product 


195 


29 


1287 


AP178632 


Homo sapiens 


FEM-1-like death receptor
binding protein 


3261 


100 


1288 


AC006033


Homo 
sapiens 


similar to MLN 64; similar to
138027 (PID:g2135214)


1195 


100 


1289 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to
138027 (PID:g2135214) 


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


351 


54


1291 


273424 


Caenorhabdit 
is elegans 


C44B9.1 


235 


36 


1292 


Y94871 


Homo 
sapiens 


Human protein clone HP02551. 


1222 


100 


1293 


AF130425 


Homo sapiens 


retinoblastoma-associated
protein RAP140


489 


29 


1294 


G03856 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7937.


538


99 


1295 


AF133670 


Mus mus cuius 


ARL-6 interacting protein-2


367 


51 


1296 


AJ249735 


Homo sapiens 




1142 


100 


1297 


X575S0 


Escherichia 
coli 


^ssffCt protein 


535 


100 


1298 


AF169284 


Homo sapiens


LIM and cysteine-rich domains
protein 1 


1997 


100 


1299 


U41023 


Caenorhabdit
is elegans


coded for by C. elegans cDNA 
y^o jii , j , cuacu tor oy 
ykl09h8 .5 


324 


29 


1300 


AB024523 




baSsif le Y"llfYt"i** 1 litres f anl-nv 


1206 


100 


1301 


XSS989 


Homo sapiens 


eosinophil cationic-related


737 


99 


1302 


AF007151 


Homo sapiens 


unknown 


1481 


100 


1303


X52904 


Escherichia
coli 


open reacting irame ^aa x — bb J 


359 


100 


1304 


U19577 


Escherichia 
coli 




... 

242 


93 


1305 


AF266508 


Mus mus cuius 




1409 


97 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25.


932 


100 


1307 


U58750 


Caenorhabdit 
is elegans 


similar to the mitochondrial 

carrier family


365 


54 


1308 


AF044774 




■*j*.*sct/v^?ojLfit ciuslsc region 
protein 2 


2681 


99 


1309 


AL078593 


Homo sapiens


H«T*3inpi i fVTi&nconi 

UU^lUDl, J. \ Jft. AnWuHU O OU ) 


267 


34 


1310 


X82693 


Homo sapiens 


E48 antigen 


620 


96 


1311 


Z82263 


Caenorhabdit
is elegans




283 


35 


1312 


AF131213 




chromosome 16 open reading
frame 5 


1493 


100 


1313 


Y41763 


Homo
sapiens


Human PRO938 protein
sequence.


1636 


100 


1314 


AF196972 




JM24 protein 


2239 


100 


1315 


AF053356 




insulin receptor substrate 
like protein 


228 


97 


1316 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100


1317


AF153127 


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


S3 


1319 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1651 


86 


1320 


X56932 


Homo sapiens


23 kD highly basic protein 


1044 


100 


1321 


AF174605 


Homo 

>Y83086 
Y83086 09- 
MAR-2000 28- 
AUG-1998 F- 
box protein 
FBP-18. 
[Homo 
sapiens 


F-box protein Fbx25


467 


70 


1322 


M61732 


Trypanosoma 
cruzi 


neuraminidase 


214 


24 


1323 


Y17013 


porcine 
endogenous 


pol 


304 


64






retrovirus 








1324 


AL138655


Arabidopsis 
thaliana 


putative protein 


1174 


37 


1325 


AL138655 


Arabidopsis 
thaliana 


putative protein 


946 


35 


1326 


AL133215 


Homo sapiens 


bA108L7.2 (novel protein 
similar to rat tricarboxylate 
carrier) 


1322 


99


1327 


AF161541 


Homo sapiens 


HSPC056 


1357 


99 


1328 


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence . 


785 


96 


1329 


L10910 


Homo sapiens 


splicing factor 


912 


82 


1330 


AF146568 


Homo sapiens 


MIL1 protein 


1936 


100 


1331 
1332


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide. 


232 


39 


Y41741 


Homo 
sapiens 


Human PRO704 protein 
sequence . 


1860 


100 


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1


411 


91 


1334 


Z82271 


Caenorhabdit
is elegans 


Similarity to Mouse kinensin- 
like protein KIF4 comes from 
this gene 


578 


44 


1335 


AE000810 


Methanobacte
rium
thermoautotr
ophicum


conserved protein 


290 


43 


1336


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11. 


1019 


91 


1337 


AB027003 


Mus musculus


protein phosphatase 


378 


84 


1338 


U648S6 


Caenorhabdit 
is elegans 


weak similarity to TPR
domains 


215 


40 


1339 


AE001394 


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 


X76717 


Homo sapiens 


MT-11 protein 


204


89 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398- 
67881 


289 


45 


1342 


AJ276171 


Homo sapiens 


ASPIC 


2122 


100 


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 

(PID:g4650844) 


894 


35 


1345 


AF257466 


Homo sapiens 


N-acetylneuraminic acid 
phosphate synthase 


1880 


99 


1346 


Y25896 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148 


100 


1347 


AJ272073 


Torpedo 
marmorata


male sterility protein 2-like
protein 


1664 


58 


1348 


AF161548 


Homo sapiens 


HSPC063 


1018 


98 


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


1117 


100 


1351


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90869 


Escherichia 
coli 


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP- 14 


613 


100 


1354 


AC005328


Homo sapiens 


R26660_1, partial CDS


870 


74 


1355 


AC024876 


Caenorhabdit 
is elegans 


contains similarity to 
SW:RPB1_CRIGR


829 


61 


1356 


AF077226


Homo sapiens 


copine III 


1876 


44 


1359 


AF217188 


Mus mus cuius 


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234


3869 


100 


1361 


AL163279 


Homo sapiens 


homolog to cAMP response 


5035 


99








element binding and beta 
transducin family proteins 






1362 


Z48475 


Homo sapiens 


glucokinase regulator 


3160 


99 


1363


Z48475


Homo sapiens 


glucokinase regulator 


2682 


97 






Homo sapiens 


megakaryocyte-enhanced gene
transcript 1 protein; MEGT1 
protein 


2055 


99 


1365


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1367 


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein 
similar to C. elegans
T19B10.6 (Tr:Q22557)) 


2581 


99 


1368 


Y34124 


Homo 
sapiens 


Human potassium channel 
K+Hnovi5 . 


1342 


100 


1369 


AJ245621 


Homo sapiens 


CTL2 protein 


3728 


99 


1370


AF008220 


Bacillus 
subtilis


YtaG 


429 


45 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (AA -
25 to 1018) (3416 is 2nd base 
in codon) 


5908 


99 


1372 


Z9BD48 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain
protein) 


1296 


99 


1373 


AF154415 


Homo sapiens 


FLASH 


10253 


100 




U202S6 


Rattus 
norvegicus


lamina associated polypeptide 
1C 


1567 


69 


1375




Homo sapiens


DOCl 


1645 


46 


1376


/VuJL J. / Jj / 


Homo 
sapiens 


bA393J16.1 (zinc finger
protein 33a (KOX 31))


250 


60 


1377 


AC005328 


Homo sapiens 


R26660_1, partial CDS


1126 


100 


1378


U35113 


Homo sapiens 


metastasis-associated gene


1823 


69 


1379


L»153 13 


Caenorhabdit
is elegans 


putative 


858 


58 


1380 


Y25756 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


ANKHZN 


959 


97 


1383


AF237676 


Mus musculus


G beta-like protein GBL


1721 


96 


1384


AF237676 


Mus musculus 


G beta-like protein GBL


1043 


70 


1385 


Y5B793 


Homo sapiens 


Human calcium regulatory 
protein CaREG-1.


715 


100 


1386


AF212162 


Homo sapiens 


mnem 


10369 


99 


1387 


AL031685 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388


AC004890


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380
>W06316 W06316 03-OCT-1996 
27-APR-1995 TRP-1 protein. 


542 


86 


1389 


AF187989 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390




Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 




Homo sapiens


PIST 


1410 


97 


1392 


AF282265 


Homo sapiens 


inner centromere protein 

INCENP


1794 


99 


1393 


X90840 


Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584


99 


1394


ili, u / O £• t Z* 


Homo sapiens


zinc finger protein SBBIZ1 


3208 


99 


1395


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305. 


299 


75 


1396 


AC004809 


Arabidopsis 
thaliana


Similar to 


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


65 


1399 


AL133396 


Homo 
sapiens 


dJi06BH6.4 (prion protein 
like protein doppel) 


962 


100 


1400 


Y48611


Homo sapiens


Human breast tumour- 
associated protein 72.


817 


99 


1401 


AC004472 


Homo sapiens 


PI. 11659 5 


280 


54 


1402


X91489 


Saccharomyce
s cerevisiae 


putative HMG box 


164 


27


1403 




Homo 
sapiens 


Human transferase TRNSFS-14. 


2842 


100 


1404 


YQ1 OSR 


Mus musculus 


tex261 


1010 


99 


1405 


AB012084 


Mus musculus 


ITM 


194


29


1406




Homo sapiens 


GTPase activating protein 


3233 


99 


1407 


AJ010585 


Rattus 
rattus


PTB-like protein


2684 


99 


1408 


X75760 


Drosophila 

melanogaster


LRR47 

- 


364 


29 


1409 


U / OOJ.O 


Mus musculus


N-RAP


804 


48 


1410 


AC005578 


Homo sapiens 


F20887_1, partial CDS


835 


63 


1411 




Escherichia 
coli


art, hypothetical protein. 


360 


100 


1412 


X01563 


Escherichia 
coli


L5 (rplE) (aa 1-179) 


911 


100 


1413 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens 


organic anion transporter 

OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha-
aminoadipate aminotransferase


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418


Y09945 


Rattus 
norvegicus 


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus


guanine nucleotide-binding 
protein beta 5 


2179 


76 


1420 


AL162458 


Homo sapiens 


bA465L10.5 (KIAA1176 (novel
protein, presumed ortholog
of mouse K-Cl cotransporter
KCC2))


5696 


100 


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308.


152 


29 


1422 


Y94923 


Homo sapiens 


Human secreted protein clone 
qsi4_3 protein sequence SEQ 
ID NO: 52. 


4039 


99 


1423 


AF177388


Homo 
sapiens 


cancer-amplified 
transcriptional coactivator 
ASC-2 


10748 


99 


1424 


Y48517 


Homo sapiens 


Human breast tumour- 
associated protein 62. 


1851 


99 




AT aVOO** o 


Homo sapiens 


BM-006 


1454 


89 


1426 




Homo sapiens 


BM-006 


853 


79 


1427 


AF112886 


Bos taurus


differentiation enhancing 
factor 1 


4693 


95 


1428 




Homo sapiens 


Gu protein 


1372 


63 


1429 


AF161534 


Homo sapiens 


HSPC049 


2853 


78 




A* X2bU4.3 


Mus musculus 


bisphosphate 3'-nucleotidase


275 


30 


1431 


Y66718 


Homo 
sapiens 


Membrane-bound protein 
PRO1105.


1886 


100 






Homo sapiens 


cell recognition molecule 
Caspr2 


568 


100 


1433 


AB044560 


Mus mus cuius 


Gliacoiin 


X92 


34 


1434 


R99900 


Homo sapiens 


NTII-1 nerve protein,
facilitates regeneration of
nerve cells.


707 


51 


1435 


AF220530 


Homo sapiens 


myo-inositol 1-phosphate
synthase A1


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


bridging integrator-3


1282 


100 


1438 


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1.


595 


98


1439 


AJ293659 


Homo sapiens 


mucolipidin


628 


97 


1440 


AF219138


Homo sapiens 


GGA3 long isoform


3083 


100 


1441 


AF219138


Homo sapiens 


GGA3 long isoform 


3346 


100


1442 


AB039669 


Homo sapiens 


ALKX3 


1944 




1443 


AF237711 


Drosophila 
melanogaster


Diablo 


151 


27 


1444 


AJ011896 


Homo sapiens 


Naf1 beta protein


439 


i o 

j y 


1445 


X73874 


Homo sapiens 


phosphorylase kinase


b z «j j 


' Qft 


1446 


AF214114 


Homo sapiens 


breast carcinoma-associated
antigen BCAA




99 


1447 


AF003924 


Homo sapiens 


ANC 2H01 


2645 


99 


1448 




Caenorhabdit
is elegans 


contains weak similarity to 
an AMP-binding motif 


2843 


52 


1449 




Homo sapiens




1184 


89 


1450 


Y9SQ04 


Homo sapiens 


Human secreted protein 
VCb4__l J bSQ ID NO : 4 6 . 


£85 


100 


1451 


AF107203 


Homo sapiens 


ataxin 2-binding protein


688 


57 


1452 


AF107203


Homo sapiens 


ataxin 2-binding protein


456 


78 


1453 


ZiJOUJ.1 


Mus musculus


DMR-N9 


882 


56 


1454 




Homo sapiens 


Protein sequence and 
annotation available soon via 
jj/\o c* a TeaMB u - He iae ice r g , de 


510 


28 


1455 


AL035409 


Homo sapiens 


dJ564Mll.3 (similar to 
sialyltransferase)


1356 


100 


1456 


D44480


Mus musculus 


iuAiH-2 protein 


272 


100 


1458 


AF141326 


Homo sapiens 


RNA helicase HDB/DZCE1 


478 


45 


1459 


AF242552 


Gallus 
gallus 


retinovin 


945 


34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


84 


1461 


AB025258 


Mus musculus 


granuphilin-a 


545 


39 


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase 


2428 


99 


1463 


AC004997 


Homo sapiens 


match to ESTs 243979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1464 


AC004997 


Homo sapiens 


match to ESTs 243979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1465


U32743 


Haemophilus 
influenzae Rd 


fucose operon protein (fucO) 


315 


50 


1466 


V09022 


Homo sapiens 


Not56-like protein 


2342 


100 


1467 


AC003034 


Homo sapiens 


Homolog of rat kidney-specific 
(KS) gene 


1072 


99 


1468 


AF071544 


Spinacia 
oleracea 


ribulose-1,5-bisphosphate 
carboxylase/oxygenase small 
subunit N-methyltransferase I 


333 


26 


1469 




Homo sapiens 


Human transmembrane protein 
HTMPN-54. 


1053 


100 


1470 


Ac UJ4DOO 


Rattus 
norvegicus 


rsecS 


4504 


93 


1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MECHP-17). 


452 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large 
Subunit Pseudouridine 


1694 


100 


1473 


AF177292 


WfWTir* oarti pri n 


y enetiiOni.il j 


4026 


98 


1474 


S45936 


Homo sapiens 


nisi 


1101 


50 


1475 


Y86241 


Homo sapiens 


Human secreted protein 
HOABR60, SEQ ID NO: 156. 


1879 


98 


1476 


AJ010317 


Fugu 

rubripes 


Sand 


1278 


68 


1477 


U42831 


Caenorhabditis elegans 


coded for by C. elegans cDNA 
yk99b4.3; similar to human 
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 


1479 


X82209 


Homo sapiens 


MN1 


7116 


100 


1480 


U10S36 


Pan paniscus 


MHC class I A 


675 


84 


1481 


1 O ^3 ^ J? 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 


1274 


65 


1482 


Z98977 


Schizosaccharomyces 
pombe 


^utotivu vacuwai parous Ail 


256 


29 


1483 


AB005662 


Mus musculus 




4968 


92 


1484 


AL050120 


Homo sapiens 


hypothetical protein 


716 


100 


1485 


K27878 






1006 


53 


1486 


Y69161 


Homo sapiens 


Amino acid sequence of a 
paiLidx jJiuucj.il Kinase. 


575 


99 


1487 


X84156 


Saccharomyces cerevisiae 


ATH1 


341 


29 


1488 


AF038963 


Homo sapiens 


RNA helicase 


446 


34 


1489 




Caenorhabditis elegans 


coded for by C. elegans cDNA 
y.-cjuDJ.b; coded for by C. 
elegans cDNA yk30b3.3 


620 


42 


1490 


AE GOOQRO 


Archaeoglobus 
fulgidus 


enoyl-CoA hydratase (fad-4) 


533 


46 


1491 


riwvQ J J 


Rattus 
norvegicus 


adenylyl cyclase type IV 


707 


95 


1492 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 
sequence . 


3513 


99 




Vt *TT> ft 


Homo sapiens 


Human secreted protein (clone 
f j283-ll) . 


462 


37 


1494 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


701 


97 


1495 


Y94897 


Homo 
sapiens 


Human protein clone HP10574. 


1371 


100 






Homo sapiens 


dJ747H23.2 (novel protein) 


1550 


100 


1497 


AP037447 


Homo sapiens 


ribosomal S6 protein kinase 


2427 


100 


1498 


AL445067 


Thermoplasma 
acidophilum 


putative target YPL207w of 
the HAP 2 transcriptional 
complex related protein 


269 


35 


1499 


inm q a d *7 


Homo sapiens 


Xlll>-binding protein 51 


227 


36 


1500 


AJ2777<?fV 


Homo sapiens 


UBASH3A protein 


3509 


100 


1501 


AL050333 


Homo 
sapiens 


dJ93K22.1 (novel protein 
(contains DKFZP564B116)) 


2439 


100 


1502 


AF17989S 


Homo sapiens 


TALE homeobox protein Meis2b 


1140 


100 


1503 


API 7 R 94 ft 


Homo sapiens 


TALE homeobox protein Meis2a 


1177 


100 


1504 


Y5300S 


Homo sapiens 


Human secreted protein clone 
pn749_ 8 protein sequence SEQ 
ID N07l6 . 


1442 


99 


1505 


X82494 




fibulin-2 


3580 


99 


1506 


X98296 


Homo sapiens 


ubiquitin hydrolase 


783 


42 


1507 


AL034548 


Homo sapiens 


cajiiujij / . t> (novel protein) 


109B 


100 


1508 


Y76144 




Human secreted protein 
cii\>uucu ijy gene zi. 


1736 


100 


1509 


AF220182 


Homo sapiens 


uncharacterised hypothalamus 
proLein niuuo 


1181 


98 


1510 


U64$01 


Caenorhabditis elegans 


Gene probably begins in the 
next cosmid 


415 


58 


1511 


AL-356192 


Neurospora 


related to MDMi protein 


196 


29 


1512 


D17629 


Homo 
sapiens 


N-acetylgalactosamine 6-sulfate 
sulfatase (GALNS) 


1829 


100 


1513 


AF168717 


Homo sapiens 




C Q A 


— 


1514 


AJ243531 


Homo sapiens 


nM15 protein 


735 


100 


1515 


AC003672 


Arabidopsis 
thaliana 


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516 


AF115435 


Rattus 
norvegicus 


syntaxin 17 


1374 


90 


1517 


AF003140 


Caenorhabditis elegans 


C44S4.5 gene product 


274 


31 


1518 


AB002584 


Rattus 
norvegicus 


beta-alanine-pyruvate 
aminotransferase 


2238 


82 


1519 


AH21764 


Schizosaccharomyces 
pombe 


yeast atp12 protein precursor 
homolog 


270 


30 






1520 


AF255910 


Homo 
sapiens 


vascular endothelial 
junction-associated molecule 


547 


100 


1521 


D31764 


Homo sapiens 


KIAA0064 


170 


27 


1522 


Y66634 


Homo 
sapiens 


Membrane-bound protein 
PRO190. 


985 


100 


1523 


Y94450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107 


Arabidopsis 
thaliana 


F17F8.22 


277 


37 


1525 


AF109377 


Mus musculus 


IdlBp 


1277 


83 


1526 


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527 


Y08135 


Mus musculus 


acid sphingomyelinase-like 
phosphodiesterase 


1496 


79 


1528 


AK024423 


Homo sapiens 


FLJ00012 protein 


611 


100 


1529 


AF154502 


Homo sapiens 


dipeptidase 


o /y 


100 


1530 


AF205598 


Homo sapiens 


transposase-like protein 


1368 


100 


1531 


AF25103S 


Homo sapiens 


putative zinc finger protein 


1420 


50 


1532 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24 . 




57 


1533 


AF039023 


Homo sapiens 


Ran-GTP binding protein; 
RanBP6 


5707 


Q Q 


1534 


AC007190 


Arabidopsis 
thaliana 


F23N19.9 


374 


Tt 


1535 


AB027564 


Homo sapiens 


DINB1 


4482 


100 


1536 


Y36178 




Human secreted protein 


1 "7*7 


B7 


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone 
vb3_l derived protein. 


3S93 


99 


1538 


AF017368 


Mus musculus 


faciogenital dysplasia 
protein 2 


1 *7"7 
i. / f 


47 


1539 


AF266756 


Homo sapiens 


sphingosine kinase 


Z J1JL 


99 


1540 


Z48804 


Homo sapiens 


OA1 




100 


1541 


AF000195 


Caenorhabditis elegans 


Contains similarity to Pfam 
domain: PF00169 (PH), 
Score=20.6, E-value=1.9e-05, 
N=1 


379 


42 


1542 


Y71159 


Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin. 




99 


1543 


X76092 


Homo sapiens 


DNA binding protein RFX3 


3327 


100 


1544 


AB015330 


Homo sapiens 


HRIHFB2007 


631 


50 


1545 


AF1984S7 


Homo sapiens 


transcription factor LBP-1b 


2822 


100 


1546 


AF016417 


Caenorhabditis elegans 


Similar to BZIP transcription 
factor 


518 


42 


1547 


X55885 


Homo sapiens 


KDEL receptor 


1106 


100 


1548 


AB035495 


Carassius 
auratus 


ubiquitin-activating enzyme 
E1 


836 


42 


1549 


AL021707 


Homo sapiens 


dJ50BI15.4 (KIAA0668) 


3688 


100 


1550 


AJ223978 


Bacillus 
subtilis 


YvqK protein 


292 


42 


1551 


AF145615 


Drosophila 
melanogaster 


BCDNA.GH03377 


822 


44 


1552 


AL157734 


Schizosaccharomyces 
pombe 


putative mannosyl transferase 
involved in N-glycosylation 


435 


37 


1553 


AF079S27 


Mus musculus 


iER5" 


691 


63 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


88 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, 
ISMO-3. 


1780 


99 


1556 


AF116553 


Drosophila 
melanogaster 


antennal-specific short-chain 
dehydrogenase/reductase 


277 


32 


1557 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1975 


99 






1558 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1894 


97 


1560 


AF0920SO 


Mus musculus 




262 


44 


1561 


AL109827 


Homo sapiens 


6M3 09IC20.2 (acrosomal protein 
ACR55 (similar to rat sperm 
antigen 4 (SPAG4))) 


1607 


97 


1562 


AJ131890 


Homo sapiens 




3002 


100 


1563 


AL035424 


Homo sapiens 


dA22D12.1 (novel protein 
similar to Drosophila Kelch 


301S 


100 


1564 


AC002400 


Homo sapiens 


Gene product with similarity 
to ubiquitin binding enzyme 


2790 


100 


1565 


AC005306 


Homo sapiens 




919 


82 


1566 


AF000195 


Caenorhabditis elegans 


Contains similarity to Pfam 
domain: PF00169 (PH), 
Score=20.6, E-value=1.9e-05, 
N=1 


550 


45 


1567 


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein 
beta-TRCP2 isoform C 


2879 


100 


1568 


D49473 


Mus musculus 


truncated form of Sox17 


1047 


78 


1569 


AK025270 


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu 


4797 


99 


1571 


AF145713 


Homo sapiens 


SCHIP-1 


2388 


100 


1572 


AS003831 


Drosophila 
melanogaster 


CG18445 gene product 


180 


31 


1573 


AF074603 


Streptomyces 
griseus 
subsp. 
griseus 


NonF 


205 


38 


1574 


U28993 


Caenorhabditis elegans 


F22D3.3 gene product 


144 


27 


1575 


At 129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576 


X64878 


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


AF237711 


Drosophila 
melanogaster 


Diablo 


421 


54 


1578 


G00975 


Homo sapiens 


Human secreted protein, SEQ 
xu jvu : . 


480 


100 


1579 


AF248744 


ium parvum 


adhesive protein 


X23 


33 


1580 


AL121782 


Homo sapiens 


UU30311*. z \iluvcx protein 
(translation of cDNA 
Em:AK000219) ) 


663 


100 


1581 


AF041853 


Homo sapiens 


kinesin family member protein 
KIF3A 


345 


33 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein OIP5 


1198 


100 


1583 


AE001803 


Thermotoga 
maritima 


glycerate kinase, putative 


349 


34 


1584 


AF252283 


Homo sapiens 


Kelc!h*»lilc*5> 1 t)T*r»t"ia^n 


O H /-I 


100 


1585 


AF169675 


Homo sapiens 


hranQiiipnihirflnp m^rih pi n TJT.DTl 


3494 


99 


1586 


AF118274 


Homo sapiens 


DNb-5 


2628 


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme 


3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R 


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98_4 . 


181 


47 


1591 


Z25535 


Homo sapiens 


nuclear pore complex protein 
hNup153 


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccharomyces 
pombe 


hypothetical protein 


235 


54 








1595 


W78324 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 81 


1318 


98 


1596 


Y94906 


Homo sapiens 


Human secreted protein clone 
rb649_3 protein sequence SEQ 
ID N07l8. 




98 


1597 


AF17460S 


Homo sapiens 


F-box protein Fbx25 


1408 


99 


1598 


AB032254 


Homo sapiens 


*^ ' ^ w4Uky\At_A u%^x -i. ai qv*j cii»?5iit to zinc 
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 


95 


1600 


X82200 






2305 


100 


1601 


Y00876 


Homo 
sapiens 


Human LAPH-1 protein 


1149 


98 


1602 


AJ223351 


Homo sapiens 


niKA-inceracuxng protein 3 


2821 


99 


1603 


AJ222801 




neutral sphingomyelinase 


2268 


99 


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


1601 


99 


1605 


AF185576 


Mus musculus 


POZ/zinc finger transcription 


3435 


97 


1606 


AF093744 


Homo sapiens 


unknown 


131 


100 


1607 


A12142 


construct 


IFN-pseudo-omega 2 


800 


98 


1608 


YS7949 




Human transmembrane protein 
HTMPN-73. 


1863 


100 


1609 


AF151044 


Homo sapiens 


HSPC210 


681 


97 


1610 


X15218 




ski protein (AA 1-728) 


3765 


100 


1611 


Y08200 


Homo sapiens 


rab geranylgeranyl 
transferase 


2976 


100 


1612 


AF220560 


i^UUIW 9 v4 JL/ •!» J A £j 


s>/ j\ procein 


2486 


99 


1613 


AC004481 


Arabidopsis 


nodulin-like protein 


371 


26 


1614 


Y09501 




NADH-cytochrome-b5 reductase 


1607 


100 


1615 


Y15521 


Homo sapiens 


start position i 


3150 


37 


1616 


AJ010750 


Rattus 
norvegicus 


Castration induced prostatic 
apoptosis related protein-1. 


890 


62 


1617 


X58079 


Homo sapiens 


S100 alpha protein 


481 


100 


1618 




Homo 
sapiens 


Membrane-bound protein 
PRO1009. 


967 


100 


1619 


AJ242973 


Homo sapiens 



peptide methionine sulfoxide 
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


AD-014 protein 


288 


100 


1621 


AJ007509 


Homo sapiens 


E1B-55kDa-associated protein 


4646 


98 


1622 


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 




Archaeoglobus 
fulgidus 


A. fulgidus predicted coding 
region AF0859 


240 


36 


1624 


AL355013 


Schizosaccharomyces 
pombe 


mitochondrial carrier protein 


403 


34 


1625 


Y66746 


Homo sapiens 


Membrane-bound protein 
PRO1198. 


1184 


100 


1626 


D90053 


Sus scrofa 


destrin 


853 


100 


1627 


Y35954 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
203. 


756 


100 


1628 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein) 


470 


100 


1629 


AF132484 


Mus musculus 


unknown 


28£ 




1630 


AF01709£ 


Drosophila 
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae 
YD8419.03C 


493 


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A 


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC250 


7^3 


100 


1633 


AJ001874 


Homo sapiens 


orf 


255 


97 


1634 


AC012187 


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein 
gb|H36135, gb|Z2620Q come 
from this gene. 


143 



38 


1635 


AF026246 


Homo sapiens 


HERV-E integrase 


411 


90 


1636 


Y50943 


Homo sapiens 


Human adult brain cDNA clone 
ve8_l derived protein. 


1126 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase 


2068 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit 


1948 


96 


1639 


Y94942 


Homo sapiens 


Human secreted protein clone 
yk251_1 protein sequence SEQ 
ID NO:90. 


1320 


100 


1640 


AF235030 


Homo sapiens 


BM88 antigen 


766 


99 


1641 


AF233288 


Drosophila 
melanogaster 


WDS 


358 


26 


1642 


M19351 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2). 


1352 


100 


1644 


AF176520 


Mus musculus 


WD repeat-containing F-box 
protein FBW5 


267G 


88 


1645 


W67B16 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42 . 


1156 


100 


1646 


X67155 


Homo sapiens 


mitotic kinase-like protein-1 


4456 


99 


1647 


MS3180 


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1648 


Y87342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO:119. 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW) . 


4137 


100 


1650 


AC007136 


Homo sapiens 


Putative map kinase 
interacting kinase 


856 


99 


1651 


AB015346 


Homo sapiens 


Eps15R 


4464 


99 


1652 


AL161576 


Arabidopsis 
thaliana 


putative protein 


1341 


48 


1653 


AC005313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1656 


AB017910 


Dictyostelium 
discoideum 


myoM 


297 


32 


1657 


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-S. 


2251 


99 


1658 


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease 


137 


35 


1660 


AL078627 


Schizosaccharomyces 
pombe 


actin-like protein; (2 actin 
domains) 


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648 


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811 


100 


1664 


AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyces 
cerevisiae 


unknown 


138 


26 


1666 


AF177385 


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191__1 


1581 


47 


1668 


S67513 


Borna 
disease 
virus BDV, 
WT-1, Halle 
Bl/91, horse 
brain, field 
isolate, 
Peptide, 370 aa 


p40 


397 


43 








1669 


Z99753 


Schizosaccharomyces 
pombe 


putative NOL1-NOP2-sun family 
nucleolar protein 


569 


47 


1670 


G03130 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7211. 


427 


97 


1671 


M96625 


Gallus 
gallus 


cardiac muscle tensin 


1185 


54 


1672 


API 744 82 


Homo sapiens 


polycomb 3 


2005 


99 


1673 


Y51B46 . 


Homo sapiens 


Human T R 1 Virvmnl r\rs nrnhoi n 

fragment . 


233 


29 


1674 


AF255334 


Homo sapiens 




— T"P-=5 

ibi 


29 


1675 


Y94367 


Homo 
sapiens 




109 


30 


1676 


Y25712 




encoded from gene 2 . 


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


1580 


91 


1678 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1679 


AF163151 




dentin sialophosphoprotein 
precursor 


170 


17 


1680 


AK024453 


Homo sapiens 




1349 


100 


1681 


AF019236 


Dictyostelium 
discoideum 


TipD 


613 


34 


1682 


AJ243459 


major 


£f ±- w w ^»v~/^#^iv_j QjkJi ivM jl y mil 


153 


26 


1683 


Z69369 


Schizosaccharomyces 
pombe 


putative GTP-binding protein 


SOU 


46 


1684 


X94910 


Homo sapiens 


ERp28 




100 


1685 


AF28647S 


Takifugu 
rubripes 


retinitis pigmentosa GTPase 
regulator-like protein 


196 


19 


1686 


AF191298 


Homo sapiens 


vacuolar sorting protein 35 


4087 


100 


1687 


AJ275986 


Homo sapiens 


transcription factor 


2958 


100 


1688 


AJ275986 


Homo sapiens 


transcription factor 


1886 


88 


1689 


X07311 


Drosophila 
melanogaster 


hps 1* ^shnrlr nmhAi n 


138 


43 


1690 


AF240463 


Rattus 
norvegicus 


LIS1-interacting protein 
NUDE1 


1383 


83 


1691 


AJ272078 


Homo sapiens 


APOBEC-1 stimulating protein 


1256 


68 


1692 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein 


1336 


60 


1693 


AF177942 


Xenopus laevis 


katanin p60 


1664 


66 


1694 


AF263539 


Homo sapiens 


arginine N-methyltransferase 


X / f*k 


100 


1695 


AF2^2^B9 


Homo 
sapiens 


protein arginine 
N-methyltransferase 1-variant 2 


110£ 


81 


1696 


AK000193 


Homo sapiens 


unnamed protein product 


1060 


100 


1697 


AB041035 


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase 


3122 


100 


1698 


AB041035 


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase 


2181 


100 


1699 


AF025772 


Homo sapiens 


C2H2 zinc finger protein 


488 


54 


1700 


Y44676 


Homo sapiens 


Human ARF-Related Protein-1 
(HARP-1). 


938 


97 


1701 


AK022467 


Homo sapiens 


unnamed protein product 


315 


98 


1702 


AB024574 


Homo sapiens 


GTP-binding like protein 2 


1172 


100 


1703 


AF05507B 


Homo sapiens 


zinc finger protein 42 


421 


52 


1704 


AF198092 


Mus musculus 


RP42 


1057 


77 


1705 


AE003573 


Drosophila 
melanogaster 


CG12474 gene product 


161 


33 


1706 


AB036345 


Drosophila 
melanogaster 


aquaporin 


164 


24 


1707 


Y55927 


Homo sapiens 


Human STLK2 protein. 


2146 


100 


1708 


U27121 


Danio rerio 


G12 


212 


47 


1709 


AI391710 


Arabidopsis 
thaliana 


putative protein 


505 


50 








1710 


B01311 


Homo sapiens 


Human PRO241 polypeptide. 


1649 


97 


1711 


tia men ' " 


Mus musculus 


formin binding protein 30 


4561 


85 


1712 


z^Tm in a 


Mus musculus 


skeletal muscle and cardiac 
protein 


1490 


89 


1713 


AF255303 


Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


4416 


99 


1714 


AF255303 


Homo sapiens 


membrane-associated nucleic 
acid binding protein 


2960 


100 


1715 


t*08227 


Rattus 
norvegicus 


Ras-related protein 


511 


51 


1716 


AF16879S 


Rattus 
norvegicus 


schlafen-4 


1129 


44 


1717 


AF196304 


Homo sapiens 


SUMO-1-specific protease 


5S04 


99 


1718 




Homo sapiens 


HMG20A 


1782 


100 


1719 


AB029333 


Halocynthia 
roretzi 


HrPET-1 


1069 


46 




At 0713 17 


Mus musculus 


COP9 complex subunit 7b 


1297 


97 


1721 


AJ272215 


Homo sapiens 


HEYL protein 


1681 


99 


1722 


G01982 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6063 . 


718 


100 


1723 


AL032643 


Caenorhabditis elegans 


similar to Uncharacterized 
protein family UPF0034, 


825 


41 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6053. 


58£ 


92 


1725 


Y94441 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


100 


1726 


AF2S5443 


Homo sapiens 


CGI-201 protein 


4397 


99 


1727 


AF183426 


Homo sapiens 


HT004 protein 


1810 


99 


1728 


D10884 


Bos taurus 


neurocalcin 


1002 


99 


1729 


Z18529 


Gallus 
gallus 


tensin 


1411 


84 


1730 


Z73423 


Caenorhabditis elegans 


cDNA EST EMBL:Z1490a comes 
from this gene-cDNA EST this 
gene 


233 


41 


1732 


AF090891 


Homo sapiens 


PRO0105 


470 


30 


1733 


AJ277724 


Homo sapiens 


histone deacetylase 8 


2015 


100 


1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131. 


503 


95 


1735 


D45913 


Mus musculus 


leucine-rich-repeat protein 


3531 


94 


1736 


AF096709 


Drosophila 
virilis 


failed axon connections 
protein 


276 


32 


1737 


AF195120 


Homo sapiens 


dynactin p62 subunit 


2417 


99 


1738 


1*15314 


Caenorhabditis elegans 


contains similarity to Pfam 
family PF01772 N=l 


206 


37 


1739 


X54618 


Listeria 
monocytogenes 


phosphatidylinositol specific 
phospholipase C 


134 


27 


1740 


AL»0316 5B 


Homo sapiens 


dJ310Ol3.4 (novel protein 
similar to predicted C. 
elegans and C. intestinalis 
proteins) 


123 


31 


1741 


Y35924 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
173 . 


1013 


99 


1742 


AC013354 


Arabidopsis thaliana 


r J.DX1XD .13 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1S54 


61 


1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPS1A 


1224 


70 


1746 


Y99372 


Homo sapiens 


Human PRO1430 (UNQ736) amino 
acid sequence SEQ ID NO: 116. 


1332 


99 


1747 


Y94294 


Homo sapiens 


Human coenzyme A-utilising 
enzyme CoAEN-2. 


842 


100 






1748 


AK024436 


Homo sapiens 


FLJ00026 protein 


1619 


100 


1749 


AE000877 


Methanobacterium 
thermoautotrophicum 


conserved protein 


231 


36 


1750 


AF101361 


Drosophila melanogaster 


abnormal X segregation 


193 


33 


1751 


Y15067 


Homo sapiens 


ZNF232 


B89 


100 


1752 


^1 fl"3 ft 


Homo sapiens 


GAP-like protein 


822 


100 


1753 


ALUUJ \J j J 


Homo sapiens 


OXYSTEROL-BINDING PROTEIN; 
45% similarity to P22059 
\fj.Lf . gx^y j uo J 


352 


57 


1754 


X690B9 


Homo sapiens 


165kD protein 


5703 


99 


1755 


AT. OA Q*7QC 


Homo sapiens 


ajbxzus .3 (novel protein) 


1039 


100 


1756 


AL031393 


Homo sapiens 


dJ733D15.1 (Zinc-finger 
protein) 


2765 


100 


1757 


AB046672 


Homo sapiens 


UDP-GalNAc: polypeptide N- 
acetylgalactosaminyltransferase 


2020 


99 


1758 


AL022238 


Homo sapiens 


dJ1042K10.4 (novel protein) 


776 


43 


1759 


RBI I 


Homo sapiens 


double homeobox protein 


375 


54 


1760 


Y12065 


Homo sapiens 


hNop56 


2959 


99 


1761 


BT f\A DTI O 


Homo sapiens 


dJ686C3.2 (nucleolar protein 
hNop56) 


2595 


99 


1762 




Homo 
sapiens 


Gene product with similarity 
to dynein beta subunit 


1542 


51 


1763 


AF169017 


Homo sapiens 


formiminotransferase 
cyclodeaminase 


877 


100 




U91541 


Homo sapiens 


human formiminotransferase 
cyclodeaminase (ftcd) protein, 
carboxy-terminal end 


596 


100 


1765 


HOUJ.J jQ3 


Bacillus 
halodurans 


YlqF 


350 


34 


1766 


Y38421 




Human secreted protein 
encoded by gene No. 36. 


145 


71 


1767 


AC009176 


Arabidopsis 


putative ribulose-1,5- 
bisphosphate 
carboxylase/oxygenase small 
subunit N-methyltransferase I 


21Ei 


27 


1768 


AK000647 


Homo sapiens 


unnamed protein product 


737 


99 


1769 


AJ238982 


Homo sapiens 


vwwj protein 


2665 


99 


1770 


U73522 




AMSH 


1214 


56 


1771 


U89435 


Mus musculus 


unknown 


829 


86 


1772 


S70011 


Rattus sp. 


tricarboxylate carrier 


1604 


95 


1773 




Homo sapiens 


dJ44A20.2 (novel protein) 


2036 


100 


1774 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308. 


1057 


99 


1775 


AF110330 




glutaminase 


3146 


100 


1776 


AJ269529 


Homo sapiens 


glycerol 3-phosphate permease 


2787 


100 


1777 


Z81579 


Caenorhabditis 
elegans 


cDNA EST yk76f1.5 comes from 
this gene 


232 


31 


1778 


AY007239 




monooxygenase X 


1875 


99 


1779 


AL109608 


Schizosaccharomyces 
pombe 


oxysterol-binding protein 
family 


644 


38 


1780 


AP254260 


Homo sapiens 


tuftelin 1 


1729 


100 


1781 


L07924 


Mus musculus 


guanine nucleotide 
dissociation stimulator 


247 


SO 


1782 


AF295773 


Homo 
sapiens 


ral guanine nucleotide 
dissociation stimulator 


142 


49 


1783 


AK024475 


Homo sapiens 


FLJ00068 protein 


4333 


100 


1784 


AK024475 


Homo sapiens 


FLJ00068 protein 


3996 


93 


1785 


G03933 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8014. 


570 


100 


1786 


S82637 


Homo sapiens 


Ig lambda-like gene/beta- 
glucuronidase exon 11 homolog 


247 


100 


TABLE 3 



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 8.250e-12 157-181 


3 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.085e-13 358-381 


4 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.400e-10 1129-1146 
BL00028 16.07 1.257e-09 820-837 


5 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e-33 413-450 
BL00023 24.31 4.545e-27 353-390 


6 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e-33 413-450 
BL00023 24.31 4.545e-27 353-390 


7 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e-33 413-450 
BL00023 24.31 4.545e-27 353-390 


8 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e-33 413-450 
BL00023 24.31 4.545e-27 353-390 


9 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.119e-09 863-917 


10 


PR00464 


E-CLASS P450 GROUP II 
SIGNATURE 


PR00464D 17.40 6.182e-12 294-312 
PR00464G 12.41 4.231e-11 377-393 


11 


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e-09 502-520 


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 6.500e-10 89-99 
PF00023B 14.20 2.636e-09 56-66 


14 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e-09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 9.8£8e-10 517-535 
PR00208A 12.59 2.233e-09 520-538 


17 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI. 


PD00066 13.92 8.200e-14 282-295 
PD00066 13.92 9.400e-14 477-490 
PD00066 13.92 6.500e-13 505-518 
PD00066 13.92 9.500e-13 254-267 
PD00066 13.92 1.429e-12 393-406 
PD00066 13.92 6.571e-12 421-434 


18 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.200e-25 55-80 


20 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e-26 154-199 
BL00487F 18.79 8.984e-22 235-276 
BL00487G 26.82 4.082e-12 287-329 


21 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e-26 154-199 
BL00487F 18.79 8.984e-22 235-276 
BL00487G 26.82 4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-26 302-333 


23 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-26 302-333 


25 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115T 8.45 7.273e-
29 1208-1242 BL00115Q
18.08 2.776e-21 953-
983 BL00115Y 11.86
8.000e-17 1604-1650
BL00115M 19.19 8.130e-
16 731-774 BL00115H
14.34 9.392e-16 463-
496 BL00115A 15.44
7.414e-15 43-82
BL00115R 6.50 6.128e-
14 983-1010 BL00115J
16.71 9.289e-14 591-
617 BL00115I 8.33
4.336e-13 535-590
BL00115L 12.25 5.939e-
13 662-694 BL00115G
11.65 6.011e-13 435-
463 BL00115K 15.03
3.417e-10 617-659
BL00115O 16.76 5.805e-
10 863-913 BL00115P
11.54 7.538e-10 913-
953 BL00115S 18.24
7.968e-10 1010-1052
BL00115U 10.34 4.475e-
09 1242-1265


26 


BL00420 


Speract receptor repeat 
proteins domain 
proteins. 


BL00420A 20.42 4.109e- 
11 81-110 BL00420A 
20.42 8.820e-10 84-113 


27 


BL00050 


Ribosomal protein L23 
proteins. 


BL00050A 23.71 9.250e- 
27 94-127 BL00050B 
14.81 8.125e-12 133- 
147 


28 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.089e- 
10 41-54 


29 


PF00756 


Putative esterase. 


PF00756C 14.12 1.108e- 
09 486-516 


32 


BL00557 


FMN-dependent alpha-
hydroxy acid
dehydrogenases proteins.


BL00557D 17.7$ 5.065e- 
37 274-316 BL00557A 
35.08 8.909e-29 24-73 
BL00557C 15.59 1.000e-
28 227-257 BL00557B
21.27 8.898e-22 130- 
169 


34 


PR00629 


SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PR00629E 9.90 5.886e-
35 299-328 PR00629F
10.95 8.364e-32 334-
361 PR00629B 13.66
3.786e-27 224-247
PR00629A 13.45 8.364e-
21 206-222 PR00629C
3.80 4.000e-12 249-261
PR00629D 12.45 3.739e-
11 276-286


35 


PD01270 


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131 
PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


36 


PD01270 


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137- 
166 


37 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412C 10.28 9.241e- 
10 264-298 


38 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412C 10.28 9.241e- 
10 264-298 


39 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e- 
10 254-298 


40 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.366e-
14 342-360 PR00380C
13.18 6.927e-13 375-
394 PR00380D 9.93
2.180e-12 429-451
PR00380A 14.18 5.154e-
12 143-165


44 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 239-290 BL00345A
13.96 2.452e-14 204-
223


45 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A
13.96 2.452e-14 180-
199


46 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e-
26 172-202 DM01551C
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 9.328e- 
11 246-260 


48 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.231e- 
33 6-45 


50 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e-
19 994-1019 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72
9.471e-14 1020-1042
BL00972C 16.48 7.000e-
13 360-375 BL00972B
9.45 8.269e-10 302-312


51 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e-
19 990-1015 BL00972A
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1016-1038 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.063e- 
14 10-54 


53 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e-
17 20-38 PR00988F
12.23 7.828e-15 196-
210 PR00988C 13.64
6.108e-14 104-120
PR00988E 8.27 3.872e-
11 174-186 PR00988D
5.95 6.878e-10 160-171
PR00988B 11.60 2.915e-
09 57-69


55 


PR00762 


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e-
21 294-314 PR00762D
11.29 4.103e-19 509-
530 PR00762A 14.22
9.333e-18 199-217
PR00762F 15.12 3.100e-
16 563-583 PR00762B
12.12 6.063e-16 230-
250 PR00762E 12.07
2.286e-15 545-562
PR00762G 14.13 6.276e-
13 601-616


56 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 8.800e- 
10 153-203 


58 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e- 
10 1080-1135 


59 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e- 
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 680-693 


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 


70 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 8.714e- 
10 51-64 


72 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins . 


BL00790N 13.25 5.116e-
10 93-120


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I. 


DM00471A 11.73 9.357e-
13 53-66 DM00471B
8.45 4.557e-12 70-81


80 


PD02876 


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 223-236 PD02876D
12.13 2.588e-12 334-
351


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 282-295 PD02876D
12.13 2.588e-12 393-
410


83 


BL00708 


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601 


84 


PR00014


FIBRONECTIN TYPE III
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 


89 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 8.200e-
09 264-279 PR00320B
12.19 8.650e-09 264-
279


93 


BL00455 


Putative AMP-binding
domain proteins.


BL00455 13.31 2.588e-
14 316-332


95


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
10 123-154


96 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.318e-
13 134-146 PR00081A
10.53 2.500e-12 54-72


98 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e-
24 401-423 PR00380D
9.93 7.188e-20 613-635
PR00380B 12.64 7.517e-
16 529-547 PR00380C
13.18 2.756e-13 560-
579


102 


PR00300 


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE


PR00300A 9.5t> 7.545e- 
x**± jtoy — ^uo 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


oukjuh /ya x&.o/ o.786e- 
18 298-314 BL00479A 
19.86 4.913e-16 155-
178 BL00479A 19.86

BL00479B 12.57 6.294e-
12 181-197


106 


BL01019 


family proteins. 


BL01019A 13.20 8.013e-
12 43-83


107 


DM01970 


O lew 7K<ci9 it vriDinf 
v A. ft aadj/ . xz, I JJxtJj.JL. 

ENDOSOMAL III.


urmji9'/uii a.6D S.OOOe- 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain
proteins.


BL00191K 17.38 4.951e-
27 238-282 BL00191J
11.37 6.447e-17 182- 
204 


109 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.938e- 
37 8-47 


110 


BL01138


Scorpion short toxins 
proteins . 


BL01138A 10.96 8.297e- 
10 38-50 


113 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 5.500e-
23 156-187 BL00107B
13.31 9.100e-14 225-
241


117 


BL00214 


Cytosolic fatty-acid
binding proteins.


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 8.560e- 
13 36-67 


119


PR00529


GONADOTROPHIN RELEASING
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 7.506e- 
10 158-177 


120 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 9.400e- 
09 80-95 


127


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 7.158e- 
13 216-241 


128


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318-
331 BL01032G 8.33
8.932e-11 282-296
BL01032I 10.42 8.902e- 
09 379-389 


129 


BL01310 


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 6.694e- 
26 28-64 


130


PR00990




PR00990B 12.32 9.534e- 
15 47-67 PR00990A 
J-b.xiJ D.DwOe-14 2D-42 
PR00990C 12.62 2.412e- 
09 119-133 


133 


BL00880 


Acyl-CoA-binding
protein. 


BL00880 17.52 5.575e- 
26 72-122 


134 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 9.308e-
14 18-37


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.96 6.779e-
10 475-496


136


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e-
29 71-107


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e- 
14 214-231 BL00028 
16.07 9.471e-14 102- 
119 BL00028 16.07 
2.800e-13 18-35 






WO 01/53312 



PCT/US00/34263



BL00028 16.07 5.500e-
13 74-91 BL00028
16.07 9.100e-13 186-
203 BL00028 16.07
8.043e-12 46-63
BL00028 16.07 8.435e-
12 130-147 BL00028
16.07 9.217e-12 270-
287 BL00028 16.07
6.192e-11 242-259
BL00028 16.07 4.000e-
10 156-175


141 


BL00501


Signal peptidases I 
serine proteins. 


BL00501D 16.69 9.538e- 
14 113-133 BL00501C 
9.61 8.688e-10 89-101 


143 


BL01020 


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


149 


BL00126 


3'5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e-
25 509-550 BL00126E
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-
11 483-495 BL00126A
27.56 8.269e-11 442-
479


151 


BL00632 


Ribosomal protein S4 
proteins . 


BL00632 23.79 5.271e- 
20 106-149 


154 


BL00559 


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 5.304e-
19 29-58 BL00559K
13.17 2.957e-18 172-
199 BL00559J 19.63
8.385e-13 99-151
BL00559L 13.60 5.814e-
12 241-259


155 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e- 
13 13-35 


157 


BL00406 


Actins proteins. 


BL00406D 12.58 2.547e- 
18 275-330 BL00406A 
9.95 5.776e-16 15-50
BL00406B 5.47 7.429e- 
12 69-124 BL00406C 
6.75 9.682e-12 128-183 


160 


BL00132 


Zinc carboxypeptidases,
zinc-binding region 1
proteins.


BL00132A 26.07 7.000e- 
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


165 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e- 
13 139-158 



168 


BL00362 


Ribosomal protein S15
proteins.


BL00362 24.67 9.700e- 
15 129-172 


169 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.000e-
35 640-686 BL00039A

IO A A. T OC4o-11 "ill 

251 BL00039B 19.19 
4.553e-13 378-404 
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e-
12 14-36


178 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 133-169 


179


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.455e-
36 6-45


180 


PR00007 


COMPLEMENT C1Q DOMAIN
SIGNATURE 


PR00007B 14.16 7.429e- 
20 160-130 PR00007A 
19.33 4.938e-19 133-
160 PR00007C 15.60
1.225e-15 206-228 
PR00007D 9.64 6.885e- 


181 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 


182 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 263-306


183 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323


184 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


188


PR00929


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e-
17 666-682 BL00383A
13.34 8.714e-17 162-
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371-
387 BL00383C 10.10
3.000e-13 217-228
BL00383D 11.92 7.000e-
13 295-308 BL00383B
7.61 1.692e-11 187-196
BL00383C 10.10 1.750e-
09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
8.000e-09 479-488


191 


PR00450


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 7.911e-
15 83-105 PR00450C
12.22 6.286e-13 47-69 


193 


PF00564


Octicosapeptide repeat
proteins.


PF00564B 24.74 6.164e- 
16 227-278 


194 


PR00503


BROMODOMAIN SIGNATURE


PR00503D 20.81 9.156e-
15 204-224 PR00503B
9.56 9.571e-13 170-187 


195 


BL00901


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


BL00901C 20.63 3.429e- 
18 67-117 


197 


BL00636 


Nt-dnaJ domain proteins.


BL00636A 8.07 6.211e-
17 40-57 BL00636B
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE


PR00690A 10.86 9.866e-
09 463-482


199 


BL01131 


Ribosomal RNA adenine 
dimethylases proteins. 


BL01131A 26.62 2.343e-
12 84-130


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203 


DM00215 


CHOLINE -RICH PROTEIN 3. 


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN
(LDL) RECEPTOR SIGNATURE


PR00261A 11.02 4.462e-
19 65-87 PR00261C
11.37 9.308e-19 65-87
PR00261D 12.47 2.667e-
18 65-87 PR00261B
14.12 4.000e-18 143-
165 PR00261A 11.02
4.833e-18 143-165
PR00261D 12.47 7.500e-
18 143-165 PR00261B
14.12 5.065e-16 65-87
PR00261C 11.37 8.967e-
16 143-165 PR00261F
11.57 4.938e-13 143-
165 PR00261E 11.08
7.188e-13 65-87
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 1.643e-11 143-
165


209 
211 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 6.143e-
13 118-173 PF00791C
20.98 7.680e-10 132- 
171 


212 


PR00007


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007A 19.33 5.751e-
19 131-158 PR00007B
14.16 4.115e-18 156- 
178 PR00007C 15.60 
1.675e-15 201-223 
PR00007D 9.64 7.231e-
11 233-244


213 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


215 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


217 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 1.900e- 
29 568-614 BL00039A 
18.44 1.871e-23 21-60 
BL00039C 15.63 1.720e- 
11 364-388 BL00039B 
19.19 4.064e-11 277-
303


219 


BL00100 


Chloramphenicol
acetyltransferase
proteins. 


BL00100D 17.22 8.484e- 
09 68-106 


222 


PR00213 


MYELIN P0 PROTEIN
SIGNATURE 


PR00213C 15.94 3.969e-
11 199-227


224 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 1.947e-09 
144-155 




PR00875 


MOLLUSC METALLOTHIONEIN

SIGNATURE 


PR00875A 5.83 1.000e-
09 901-913


225 


BL00636


Nt-dnaJ domain proteins.


BL00636B 15.11 8.200e-
19 18-39


226 
229 


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 1.000e-
21 21-38 BL00636B
15.11 8.200e-19 45-66




PR00301 


70 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00301F 13.98 7.563e- 
13 329-346 PR00301G
13.78 4.300e-12 361- 
382 


230 


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 8.773e-
20 35-70 BL00460B
9.73 7.429e-16 78-96
BL00460C 14.35 2.831e-
12 111-134 BL00460D
16.89 8.773e-11 140-
160 


231
233


PR00647


SENR ORPHAN RECEPTOR
SIGNATURE 


PR00647B 10.19 8.522e- 
09 273-287 




BL00292


Myelins proteins.


BL00292B 20.31 7.429e-
27 244-275 BL00292A
22.87 7.750e-27 201- 
135 


234


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 6.308e-
13 7-29 PR00449C
235



ACCESSION 
NO. 



PR00019 



236 



237 



240 



241 



244 



PR00019



PD00289 
PR00011



PR00011
BL00903



245 



248 



250 



DM00179 
BL00246 



PR00927 



DESCRIPTION 



RESULTS* 



LEUCINE-RICH REPEAT
SIGNATURE 



LEUCINE-RICH REPEAT
SIGNATURE 



PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 



TYPE III EGF-LIKE 
SIGNATURE 



TYPE III EGF-LIKE 
SIGNATURE 



17.27 4.462e-11 47-70
PR00449D 10.79 7.120e- 
11 109-123 



PR00019B 11.36 7.300e- 
10 251-265 PR00019B 
11.36 5.320e-09 119- 
133 PR00019B 11.36 
1.000e-08 229-243



PR00019B 11.36 7.300e-
10 245-259 PR00019B
11.36 5.320e-09 113- 
127 PR00019B 11.36 
1.000e-08 223-237 



PD00289 9.97 8.448e-09
67-81



PR00011D 14.03 3.492e-
10 616-635



Cytidine and
deoxycytidylate 
deaminases zinc -binding 
region s. 



w KINASE ALPHA ADHESION

T-CELL. 

Wnt-1 family proteins.



PR00011D 14.03 3.492e- 
10 616-635 



BL00903 12.93 8.941e-
12 54-64



ADENINE NUCLEOTIDE 

TRANSLOCATOR 1 SIGNATURE 



DM00179 13.97 8.043e- 
09 124-134 



BL00246D 23.97 1.000e-
40 186-239 BL00246E
20.32 1.000e-40 305-
351 BL00246B 13.69
4.176e-36 105-140
BL00246A 15.75 2.286e-
24 70-90 BL00246C
15.56 4.857e-22 150-
175



PR00927E 14.93 5.114e- 
10 253-275 



255 



PD01796 



255 



BL50002 



AAA-protein family 
proteins . 



PROTEIN TRANSMEMBRANE 

COBALT ZINC CADMIU.
Src homology 3 (SH3) 
domain proteins profile. 



BL00674B 4.46 1.000e-
09 223-245



PD01796 15.01 6.045e-
09 61-88



BL50002B 15.18 2.800e- 
10 421-435 



259 



ADENYLATE KINASE 
SIGNATURE 



BL00892



262



BL00388 



"2TT 



267" 



HIT family proteins . 



Proteasome A-type
subunits proteins. 



BL00903 



BL00107 
BL00226



Cytidine and 
deoxycytidylate
deaminases zinc -binding 
region s. 



PR00094C 12.94 2.200e-
18 87-104 PR00094D
12.52 2.731e-14 161- 
177 PR00094A 10.31 
5.500e-14 11-25 
PR00094B 11.01 4.115e- 
13 39-54 PR00094E 
11.25 7.333e-13 178- 
193 



BL00892A 18.17 5.500e-
13 60-91



BL00388A 23.14 1.000e-
40 8-54 BL00388B
31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-
21 153-184 BL00388C
18.79 8.147e-16 126-
148



Protein Kinases ATP- 
binding region proteins 
intermediate filaments 



proteins. 



BL00903 12.93 5.821e- 
09 91-101 



BL00107B 13.31 1.529e- 
09 241-257 



BL00226D 19.10 1.000e-
37 362-409 BL00226B
23.86 8.043e-35 196- 
244 BL00226C 13.23 
7.000e-20 261-292 
BL00226A 12.77 6.143e- 
15 96-111 


271 


PD02952 


KINASE TRANSFERASE 
CHOLINE PROTEIN 
MULTIGENE FAMI. 


PD02952C 15.76 9.731e- 
16 235-265 PD02952B 
15.57 5.625e-09 215- 
229 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 l.OOOe- 
40 106-160 PD02929B 
18.36 8.800e-17 179- 
199 


274


BL01027 


Glycosyl hydrolases 
family 39 proteins. 


BL01027B 15.34 3.486e- 
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e- 
11 39-59 


277 


BL00052 


Ribosomal protein S7
proteins. 


BL00052A 27. 8S fi.OOOe- 
13 137-184 BL00052B 
15.17 5.143e-12 208- 
235 


279 


BL00790


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e- 
13 267-294 


280 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B
11.47 8.200e-19 70-85


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 94-112 PR00319C
13.41 1.000e-21 76-92
PR00319A 15.27 8.364e-
21 38-55 PR00319B
11.47 8.200e-19 57-72


287 


PF00929 


Exonuclease.


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e-
09 93-127


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e-
12 203-216


295 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 5.500e-
15 322-339 BL00028
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e- 
09 517-534 BL00028 
16.07 7.429e-09 489- 
506 


296 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 8.333e-
16 111-136 BL00215A
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e-
11 152-165 BL00215B
10.44 7.375e-10 59-72
BL00215A 15.82 9.824e-
10 205-230


302 


PF00953 


Glycosyl transferase.


PF00953C 19.70 8.773e-
34 236-269 PF00953A
19.58 5.000e-25 102-
129 PF00953B 6.17
1.000e-13 182-194


304 


PF00152 


tRNA synthetases class
II.


PF00152D 21.30 8.364e- 
28 422-461 PF00152C 
28.03 9.250e-21 220-
257 PF00152B 15.67
2.658e-13 159-184
PF00152A 19.68 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 8.250e- 
35 37-76 


305 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 5.840e-
09 92-135


307 


PR00454


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e-
09 1167-1186


308


PR00237


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237E 13.03 5.091e-
13 188-212 PR00237G
19.63 7.207e-13 268-
295 PR00237A 11.48
4.375e-11 24-49
PR00237C 15.69 3.057e- 
10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e- 
10 230-255 PR00237B 
13.50 9.438e-10 57-79 




BL00522


DNA polymerase family x 
proteins. 


BL00522C 11.90 7.577e- 
24 315-339 BL00522F 
14.90 1.310e-15 470- 
494 BL00522A 25.52 
1.265e-14 179-226 
BL00522E 19.63 8.615e-
14 430-460 BL00522B
27.30 9.625e-12 267-
313


310 


BL00326


Tropomyosins proteins.


njjuirj^biJ o* /b b.235e- 
10 856-897 


312 


BL00290


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290A 20.89 4.706e-
14 151-174 BL00290B
u . ± / a .uuue-12 211- 
229 


313 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A
13.96 9.217e-16 1-20


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e-
15 63-76


317 


BL01020 


SAR1 family proteins.


BL01020C 15.35 3.198e- 
17 79-130 


318 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235


321 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.688e- 
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e- 
12 558-577 


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e- 
30 183-236 BL01241 
35.81 3.222e-13 282-
335


326 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 515-566 BL00412D
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569 
BL00412D 16.54 1.827e- 
09 514-565 BL00412D 
16.54 1.918e-09 513- 
564 BL00412D 16.54 
2.102e-09 520-571 


328 


BL00232


Cadherins extracellular
repeat proteins domain
proteins.


BL00232B 32.79 9.557e-
20 151-199 BL00232B
32.79 2.246e-18 41-89
BL00232B 32.79 5.985e-
18 370-418 BL00232B
32.79 5.500e-16 258-
306 BL00232B 32.79
9.384e-15 475-523
BL00232C 10.65 2.537e-
12 256-274 BL00232C
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e-
11 39-57


330


PR00454


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.393e-
18 27-49


333 


BL01016 


Glycoprotease family 
proteins. 


BL01016C 22.84 3.925e-
32 70-115 BL01016E
14.88 5.286e-19 149- 
177 BL01016H 13.71 
7.577e-13 291-301 
BL01016D 8.86 3.298e- 
11 127-140 BL01016G 
7.14 5.622e-10 261-271 
BL01016A S.6S 7.167e- 
10 4-19 BL01016F 
13.34 1.563e-09 200- 
212 BL01016B 8.93 
8.855e-09 38-50 


339 


BL01115 


GTP-binding nuclear 
protein ran proteins . 


BL01115A 10.22 5.500e- 
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 1.231e- 
33 10-49 


341


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.042e-
09 55-109


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.400e- 
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e-
11 135-154


347 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187 


Calcium-binding EGF-like 
domain proteins pattern 
proteins. 


BL01187B 12.04 1.783e-
13 100-116 BL01187B
12.04 8.435e-13 276-
292 BL01187B 12.04
8.800e-11 13-29
BL01187B 12.04 7.429e- 
10 54-70 BL01187B 
12.04 5.725e-09 231- 
247 BL01187A 9.98 
7.000e-09 255-267 


352 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e-
10 366-379 PD00078B
13.14 4.522e-09 168- 
181 


354 


BL00380 


Rhodanese proteins . 


BL00380F 9.76 6.694e- 
11 542-553 


355 


PF00628 


PHD-finger.


PF00628 15.84 1.000e-
11 116-131


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e-
09 17-37


359 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e-
15 261-274 PD00066
13.92 6.500e-13 233-
246 PD00066 13.92
4.300e-09 289-302


361 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 9.604e- 
13 54-109 PF00791B 
28.49 1.095e-12 21-76 
PF00791A 27.85 1.432e- 
09 71-126 PF00791B 
28.49 7.440e-09 184- 
239 


362 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.273e- 
11 279-334 


363 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 5.080e- 
10 73-95 PR00450C 
12.22 3.278e-09 109-
131


364 


PF00242 


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


365 


PF00242 


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 1038-1092 


367 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-
09 229-243 PR00019B
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e- 
09 370-384 


368 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35


369 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE 


PR00170E 6.48 2.739e-
ACCESSION
NO. 



BL00107 



DESCRIPTION 



Protein kinases ATP- 
binding region proteins. 



RESULTS*



10 88-118 



BL00107A 18.39 1.000e-
23 276-307 BL00107B
13.31 1.692e-12 342- 
358 



BL00455 
PR00624



PD00078 



PR00511 



PD02870



PD00066 



BL00290



Putative AMP-binding
domain proteins. 



HISTONE H5 SIGNATURE



BL00455 13.31 5.714e-
12 50-66



REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 



PR00624G 4.08 4.900e- 
09 524-544 



PD00078B 13.14 5.950e- 
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 



TEKTIN SIGNATURE 



RECEPTOR INTERLEUKIN-1
PRECURSOR. 



PROTEIN ZINC-FINGER
METAL-BINDI.



PR00511D 7.11 5.371e- 

09 67-80 

PD02870B 18.83 6.000e- 

10 97-130 



Immunoglobulins and 
major histocompatibility 
complex proteins.
Mitochondrial energy 



PD00066 13.92 5.000e-
13 316-329



BL00290A 20.89 7.657e- 
09 151-174 



BL00215 



transfer proteins. 



BL00215A 15.82 5.200e-
15 221-246 BL00215A
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e-
09 272-285 BL00215B
10.44 8.500e-09 165-
178



BL00674 
PR00048



PR00761 



BL00240 



PF00676 



BL00514 



AAA-protein family 
proteins. 



C2H2-TYPE ZINC FINGER
SIGNATURE 



BL00674B 4.46 2.723e- 
16 299-321 



BINDIN PRECURSOR

SIGNATURE 

Receptor tyrosine kinase

class III proteins. 

Dehydrogenase El 
component . 



PR00048A 10.52 8.579e- 
11 141-155 



Fibrinogen beta and 
gamma chains C- terminal 
domain proteins. 



PR00761B 9.93 6.764e- 
09 55-74 



BL00240B 24.70 7.907e- 
10 118-142 



PF00676B 24.71 8.071e-
18 331-369 PF00676D
14.40 3.854e-15 486- 
506 PF00676C 16.88 
9.182e-14 454-478 



BL00514C 17.41 4.673e-
28 4432-4469 BL00514G
15.98 6.092e-14 4555- 
4585 BL00514D 15.35 
2.532e-12 4473-4486 
BL00514F 11.65 4.288e- 
10 4519-4534 BL00514H 
14.95 4.955e-10 4584- 
4609 



404



405 



PF00992 



PR00019 



BL00232 



Troponin . 



LEUCINE-RICH REPEAT
SIGNATURE 



PF00992A 16.67 5.974e-
09 105-140



Cadherins extracellular
repeat proteins domain 
proteins. 



PR00019B 11.36 1.450e- 
10 73-87 PR00019A 
11.19 8.043e-10 76-90 
PR00019B 11.36 1.000e-
09 50-64 PR00019B
11.36 1.000e-09 96-110



BL00232B 32.79 9.557e-
20 139-187 BL00232B
32.79 2.246e-18 29-77
BL00232B 32.79 5.985e-
18 358-406 BL00232B
32.79 5.500e-16 246-
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e- 
12 244-262 BL00232C 
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.437e- 
11 27-45 


407 


PF00426 


Outer Capsid protein VP4
(Hemagglutinin).


PF00426S 15.67 5.634e- 
09 902-940 


409 


BL01160


Kinesin light chain
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e-
09 86-100


412 


BL00603 


Thymidine kinase 
cellular-type proteins. 


BL00603B 11.39 8.500e- 
09 542-557 


415


BL00866


Carbamoyl-phosphate
synthase subdomain 
proteins . 


BL00866B 36.29 3.571e- 
31 245-291 BL00866C 
23.25 9.000e-25 331- 
366 


418 


PR00239


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791B 28.49 7.955e- 
14 23-78 PF00791B 
28.49 3.653e-12 273- 
328 PF00791B 28.49 
4.273e-11 156-211
PF00791B 28.49 7.818e- 

28.49 1.524e-10 56-111 
PF00791C 20.98 3.559e-
09 37-76 PF00791C
20.98 5.235e-09 170-
209 PF00791C 20.98
5.235e-09 381-420 
PF00791B 28.49 6.202e- 
09 189-244 PF00791B 
28.49 7.028e-09 435- 
490 PF00791B 28.49 
8.679e-09 367-422 


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e-
10 228-251 


429 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins. 


BL00039D 21.67 1.844e- 
34 490-536 BL00039A 
18.44 5.615e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 5.781e-
15 333-357


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433 


PR00828 


FORMIN SIGNATURE 


PR00828B 5.23 8.218e-
10 382-405 


436 


BL00415


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140 


Matrix protein (MA),


PF01140D 15.54 9.663e-


p15.


10 183-218 PF01140D 
15.54 3.093e-09 246- 
281 


449 


PR00568


DOPAMINE D3 RECEPTOR 
SIGNATURE 


PR00568G 13.95 5.551e-
09 39-53 


451


PF00084 


Sushi domain proteins 
(SCR repeat proteins. 


trc\HJ\J O'io j,*to J . BUS - 

10 47-59 


452 


BL00790


Receptor tyrosine kinase 
class V proteins.


BL00790I 20.01 2.821e-
09 618-649 


456 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D 
9.93 1.000e-21 281-303 
PR00380C 13.18 8.286e- 
17 230-249 PR00380B 
12.64 4.724e-16 194- 
212 


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 


PR00253A 9.15 9.143e- 
24 246-267 PR00253B 
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e- 
4± 452-473 


467 


PR00849 


GLYCOSYL HYDROLASE 
FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
uy yj.0-937 


471 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 8.200e-12 
33-44 


472 


BL00226


Intermediate filaments
proteins.


BL00226B 23.86 3.721e-
09 282-330 


473 


BL00344 


GATA-type zinc finger 
domain proteins. 


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481 


Thiol-activated
cytolysins proteins.


BL00481E 13.07 8.909e- 
0? 1 /j -199 


479 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.571e- 
09 393-403 


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405 


HIV REV INTERACTING
PROTEIN SIGNATURE


PR00405C 19.41 1.000e-
19 451-473 PR00405B 
11.83 4.333e-18 430- 
448 PR00405A 17.71
4.971e-18 411-431 


482


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.286e- 
10 959-974 PR00049D 
0.00 9.857e-10 958-973 
PR00049D 0.00 1.305e- 
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


485


PR00007 


SIGNATURE 


PR00007B 14.16 8.615e- 
23 653-673 PR00007A 
19.33 6.192e-22 626- 
653 PR00007C 15.60 
5.846e-19 698-720

13 732-743 


487 


PD00567


PROTEIN RNA- BINDING RNA 
REPEAT HYD. 


09 200-214 


488 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-
12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-16 71-110


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e-
09 663-678 


492 


BL01128 


Shikimate kinase 
proteins . 


BL01128A 18.84 6.464e- 
17 58-92 


497 


PF00429 


ENV polyprotein (coat 


PF00429 31.08 7.171e- 


polyprotein).


15 21-71 


498 


BL00120 


Lipases, serine 
proteins . 


BL00120B 11.37 7.923e- 
09 185-200 


500 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.353e- 
11 299-318 


501 


BL01159 


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 8.579e- 
12 131-146 


505 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 3.739e-
17 492-510 


508 


PR00120


H+-TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE


PR00120C 9.90 5.800e- 
19 705-722 


509 


DM01417


6 kw INDUCING XPMC2 
MUSHROOM SPAC22G7.04. 


DM01417B 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322-
338 


510 


PF00534 


Glycosyl transferases
group 1.


PF00534B 14.47 6.625e- 
09 346-370 


511 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e- 
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e-
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42 
1.000e-40 968-1010 
PD01841I 23.00 4.545e- 
37 762-804 PD01841E 
18.60 3.750e-36 295- 
333 PD01841J 14.94 
6.023e-35 851-888 
PD01841H 21.30 2.909e-
33 490-527 PD01841K 
14.81 7.083e-33 924- 
954 PD01841C 13.78 
9.386e-23 222-243
PD01841M 10.82 8.594e- 
21 1054-1073 PD01841I 
23.00 2.667e-13 549-
591 


514 


PR00153 


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS
ISOMERASE SIGNATURE


PR00153C 11.01 7.188e- 
13 95-111 PR00153E 
9.10 4.150e-12 122-138 


515 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.188e- 
12 410-423 


516 




3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins. 


BL00242C 16.86 8.320e- 
09 12-42 


523 


DM00031


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 1.000e-25 84-118 


525




Amyloidogenic 
glycoprotein 
extracellular domain 
proteins . 


BL00319C 17.12 8.375e- 
10 61-95 


526 


PF00789 


Domain present in
ubiquitin-regulatory
proteins.


PF00789B 19.70 3.308e- 
12 322-343 PF00789C
20.98 5.269e-09 367- 
392 


528 


BL01162 


Quinone oxidoreductase /
zeta-crystallin 
proteins . 


BL01162C 22.80 1.500e-
16 120-164 


529 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 3.893e-
09 60-73 


532 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 123-
148 


533 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 97-122


534 


BL00098 


Thiolases acyl-enzyme 
intermediate proteins. 


BL00098C 21.65 2.800e- 
38 181-227 BL00098B 
32.59 5.345e-38 86-141
BL00098D 26.30 8.364e-
35 245-288 BL00098E
22.12 1.000e-34 314-
352 BL00098F 10.18
4.971e-22 365-386 
BL00098A 10.60 6.455e- 
11 38-50 


535


PR00370 


FLAVIN-CONTAINING
MONOOXYGENASE (FMO)
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75 
6.559e-21 376-396 
PR00370B 10.91 9.591e-
21 27-46 PR00370C 
12.72 3.500e-20 140- 
157 PR00370A 3.35 
6.442e-17 4-20 


536 



BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.429e-
16 285-302 BL00028
16.07 6.294e-14 341-
358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.452e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e- 
10 313-330 


537 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 844-881 


538 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 819-856 


539 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e-
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 1.000e-
40 3-47 PD02102B 
18.28 4.375e-34 57-100 
PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 8.929e-26 100- 
146 


543 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 1.000e-
10 48-65 BL00028 
16.07 6.400e-10 193- 
210 BL00028 16.07 
1.000e-09 343-360 
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250 


TGF-beta family 
proteins. 


BL00250A 21.24 8.000e- 
31 293-329 BL00250B 
27.37 5.286e-24 354-
390 


547 


PR00319 


BETA G-PROTEIN


PR00319B 11.47 2.714e- 


(TRANSDUCIN) SIGNATURE


09 106-201 PR00319A 
15.27 7.344e-09 210- 
227 


548 




NF-kappa-B/Rel/dorsal
domain proteins.


BL01204A 17.74 1.000e-
40 8-56 BL01204D
16.42 1.000e-40 177-
221 BL01204E 13.83
7.652e-33 225-250
BL01204C 13.93 8.714e-
22 141-150 BL01204B
15.41 4.333e-16 102-
116 


549 


PR00326


GTP1/OBG GTP-BINDING 

PROTEIN FAMILY SIGNATURE


PR00326A 8.75 8.364e-
15 255-276 


551 


PF00632


HECT-domain (ubiquitin-


PF00632C 20.66 3.302e- 
23 1569-1601 PF00632B 
18.45 3.700e-21 1515- 
1543 


554 


BL00290 


Immunoglobulins and
major histocompatibility 
complex proteins. 


BL00290B 13.17 1.600e- 
14 187-205 BL00290A
20.89 2.059e-14 130- 
153 


557 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.339e- 
09 846-879 


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 6 IK PDF1 . 


DM01111L 11.93 3.762e- 
09 7-35 


562 




Poly-adenylate binding 
protein, unique domain 
proteins . 


PF00658C 16.33 9.455e- 
32 118-155 


564 


BL00141


Eukaryotic and viral 
aspartyl proteases 
proteins . 


BL00141A 12.10 4.150e- 
10 472-488 


566 


PF00855 


PWWP domain proteins.


PF00855 13.75 5.667e- 
15 272-289 


567 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.977e-
13 229-268 


569 


BL00107 



Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183-
199 


570 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183-
199 


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 454-483 PR00193C 
12.60 2.636e-31 223-
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 508- 
537 


573 


PR00193 


MYOSIN HEAVY CHAIN
SIGNATURE


PR00193D 14.36 1.857e- 
34 470-499 PR00193C 
12.60 2.636e-31 239- 
267 PR00193B 11.69
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 524- 
553


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e- 
10 885-929 


576 


BL00030 


Eukaryotic RNA-binding 
region RNP-l proteins. 


BL00030A 14.39 7.000e- 
09 276-295


577


BL00116 


DNA polymerase family B 


BL00116A 12.81 5.737e-


proteins.


13 864-877 BL00116B
11.82 1.529e-12 952- 
965 


578
579


BL00195 


Glutaredoxin proteins. 


BL00195B 15.31 7.158e-
09 121-141 




PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 9.000e-
11 217-231 PR00019B 
11.36 1.350e-09 386- 
400 PR00019A 11.19 
3.333e-09 389-403 
PR00019B 11.36 8.920e- 
09 363-377 


580 


PR00253


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR 
SIGNATURE 



PR00253A 9.15 2.125e-
25 275-296 PR00253B 
13.47 7.923e-24 301- 
323 PR00253D 16.68 
5.846e-23 444-465 
PR00253C 13.85 2.241e- 
20 335-357 


583 


PR00343 


SELECTIN SUPERFAMILY 
COMPLEMENT-BINDING
REPEAT SIGNATURE 


PR00343C 16.85 2.286e- 
11 1233-1252 PR00343C 
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e- 
10 1491-1510 PR00343C 
16.85 8.230e-10 1686- 
1705 


584 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 1.878e- 
37 79-126 DM01537B 
21.63 9.491e-30 916- 
963 DM01537A 15.14 
3.136e-11 784-804


586 


PF00013


KH domain proteins 
family of RNA binding 


PF00013 5.78 1.450e-09 
124-136 


587 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.409e- 
13 262-296 


589 


BL00478


LIM domain proteins.


BL00478B 14.79 1.643e- 
13 261-276 BL00478B 
14.79 7.709e-09 321- 
336 


590
591


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.000e- 
15 931-948 




PF00855


PWWP domain proteins.


PF00855 13.75 8.000e-
15 1062-1079 


593 




PHD-finger.


PF00628 15.84 3.455e- 
12 424-439 


594 


PR00205


CADHERIN SIGNATURE 


PR00205B 11.39 2.241e-
16 558-576 PR00205A
14.73 9.308e-13 542-
558 PR00205C 13.65 
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


595 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.789e- 
18 307-338 


598 


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3 . 


PD01675C 19.89 2.330e-
10 55-39 


600 


BL00242 


Integrins alpha chain 
proteins. 


BL00242E 9.03 9.591e-
27 985-1014 BL00242C
16.86 4.115e-26 286-
316 BL00242D 13.57
4.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80
5.000e-11 61-73
BL00242D 13.57 4.986e- 
10 291-316 



602 



PR00320
PR00278 



G-PROTEIN BETA WD-40
REPEAT SIGNATURE 



PANCREATIC HORMONE 

SIGNATURE 

Phorbol esters / 



PR00320A 16.74 5.610e- 
09 198-213 



PR00278A 12.43 4.569e-
10 331-348 



BL00479



604 



605 



BL00315
BL00415



606 



PR00926 



608 



PF00855 



609 



PF00855 



612 



DM01206 



616 



PD02699 



PR00380



diacylglycerol binding 
domain proteins. 



BL00479C 12.01 3.250e- 
12 170-183 



Dehydrins proteins. 
Synapsins proteins. 



BL00315A 9.35 1.672e-
09 424-452 



BL00415N 4.29 9.794e- 
10 295-339 



MITOCHONDRIAL CARRIER
PROTEIN SIGNATURE 



PWWP domain proteins. 



PWWP domain proteins. 



CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 



PROTEIN DNA-BINDING
BINDING DNA. 



KINESIN HEAVY CHAIN 
SIGNATURE 



PR00926F 17.75 1.000e-
13 335-358



PF00855 13.75 5.167e- 
15 265-282 



PF00855 13.75 5.167e- 
15 211-228 



DM01206B 10.69 7.411e- 
10 877-897 DM01206B 
10.69 8.027e-10 861- 
881 DM01206B 10.69 
9.137e-10 873-893 
DM01206B 10.69 1.456e-
09 859-879 DM01206B 
10.69 1.797e-09 879- 
899 DM01206B 10.69 
4.076e-09 865-885 
DM01206B 10.69 7.038e- 
09 898-918 DM01206B 
10.69 7.949e-09 871- 
891 DM01206B 10.69 
8.291e-09 767-787



PD02699A 8.91 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182



PR00380A 14.18 4.086e- 
22 288-310 PR00380D
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455



PR00380 



618 



DM01206 



621



PR00700 



KINESIN HEAVY CHAIN 
SIGNATURE 



CORONAVIRUS NUCLEOCAPSID



PROTEIN. 



PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 



PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436-
455 



DM01206B 10.69 5.143e-
12 531-551 DM01206B
10.69 2.603e-10 535-
555 



PR00700B 16.80 3.160e- 
21 561-582 



BL00239 



623 
624 



PR00407 



BL00641 



Receptor tyrosine kinase 
class II proteins. 



EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 



BL00239F 28.15 3.222e- 
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 



Respiratory-chain NADH
dehydrogenase 75 Kd 



PR00407K 9.94 8.448e-
09 326-339 



BL00641C 21.10 1.000e-
40 157-202 BL00641E 


627




subunit proteins. 


24.37 1.000e-40 255-
308 BL00641F 33.12 
1.000e-40 571-623 
BL00641A 17.15 l.aiSe- 
37 48-80 BL00641B 
12.62 5.846e-34 113- 
139 BL00641D 13.23 
9.308e-29 216-240 


630


PR00103


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103E 17.80 2.500e-
18 367-380 PR00103B
13.39 2.080e-14 297-
312 PR00103A 9.59
2.957e-14 282-297
PR00103D 10.83 3.077e-
12 346-358 PR00103C
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e- 
10 160-175 


631


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 6.211e-
16 4-22 




PF00651


BTB (also known as BR- 
C/Ttk) domain proteins.


PF00651 15.00 8.500e- 
14 37-50 


635 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 2.233e- 
10 1324-1344 DM01206B 
10.69 4.822e-10 1276-
1296 DM01206B 10.69 
7.658e-10 1328-1348 
DM01206B 10.69 8.274e- 
10 1280-1300 DM01206B 
10.69 4.532e-09 1320- 
1340 DM01206B 10.69 
7.266e-09 1326-1346 




BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.600e- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


635 


BL00657 


Fork head domain 
proteins. 


BL00657A 19.39 1.545e- 
30 101-143 BL00657B 
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 l.OOOe- 
10 607-623 


643 
647 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 4.913e-09
199-212 




PF00628


PHD-finger.


PF00628 15.84 2.350e- 
13 385-400 PF00628 
15.84 3.455e-12 464- 
479 


648 
649 




Hypothetical 
yabO/yceC/sfhB family 
proteins . 


BL01129E 13.25 4.000e-
25 332-357 BL01129C 
25.56 8.200e-23 236- 
279 BL01129B 12.51 
6.118e-13 191-212 




BL01228


Hypothetical cof family
proteins. 


BL01228D 17.44 3.908e- 


650 
651 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 6.684e-
13 771-814 




BL50002 


Src homology 3 (SH3)
domain proteins profile. 


BL50002A 14.19 1.750e- 
12 1026-1045


653 


PR00253



GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE 



PR00253A 9.15 4.000e- 
24 253-274 PR00253C 
13.85 8.800e-24 313-
335 PR00253B 13.47
1.143e-22 279-301
PR00253D 16.68 7.652e-
20 422-443 


654 


PD01719 


PRECURSOR GLYCOPROTEIN
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 128- 
156 PD01719A 12.89 
7.395e-10 1276-1304
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook) . 


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook) . 


BL00354C 6.61 8.397e- 
09 580-595 


659


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.174e-
13 539-572 DM00215
19.43 4.750e-12 549-
582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
5.339e-10 552-585 
DM00215 19.43 7.107e- 
10 544-577 


660 


PR00688


XYLOSE ISOMERASE 
SIGNATURE 


PR00688I 13.78 9.518e-
09 224-236 


661 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 5.950e-
23 249-292 


662 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 


664 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610


666 


PR00819


SIGNATURE 


10 704-720 


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e-
16 135-178 


668 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


09 139-153 PR00019A
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 3.250e-10 
681-694 BL00018 7.41
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR. 


PD00131B 34.97 1.000e-
34 356-410 PD00131C 
19.59 1.346e-26 504- 
542 


673 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PR00667G 15.33 7.557e-
10 106-123 


674 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e-
13 593-608 PR00320B 
12.19 4.115e-12 635- 
650 PR00320C 13.01 
8.435e-11 717-732
PR00320C 13.01 2.800e- 
10 635-650 PR00320C 
13.01 6.400e-10 593- 
608 PR00320B 12.19 
3.250e-09 593-608 


675 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 572-587 PR00320B 
12.19 4.115e-12 614- 






WO 01/53312 PCT/US00/34263 











629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e- 
10 614-629 PR00320C 
13.01 6.400e-10 572- 
587 PR00320B 12.19
3.250e-09 572-587 


676 


PR00019


LEUCINE-RICH REPEAT 


PR00019A 11.19 9.667e- 
09 249-263 


679 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 3.700e- 
16 225-236 PF00642
11.59 7.900e-12 187- 
198 


680 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e- 
10 286-296 


681 


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019D 15.33 4.206e-
19 227-257


682


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e- 
09 99-118 


687 


PR00049


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e- 
10 538-553 


689 


BL01024 


Protein phosphatase 2A 
regulatory subunit PR55 
proteins . 


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 




'Homeobox' domain
proteins. 


BL00027 26.43 8.071e- 
31 152-195 


692


BL00211


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e-
09 45-57 


693 


BL00211 


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e- 
09 45-57 


694 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 58-70 


696


BL00680 


Methionine 

aminopeptidase subfamily 
1 proteins. 


BL00680 14.37 5.304e-
17 173-195 


697 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 3.418e- 
11 242-265 


698 


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e-
37 170-215 DM01930F
14.16 8.232e-28 267-
303 DM01930B 19.86 
9.163e-10 37-71 


700 


PR00869 


DNA-POLYMERASE FAMILY X 
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e- 
10 77-91 PR00048A 
10.52 6.870e-10 133-
147 PR00048A 10.52 
8.826e-10 105-119 
PR00048A 10.52 5.320e- 
09 161-175 


702 


BL00523


Sulfatases proteins. 


BL00523E 19.27 2.565e-
25 326-356 BL00523A 
13.36 5.050e-16 38-55 
BL00523B 8.64 5.909e- 
15 86-98 BL00523C 
12.64 5.500e-13 137- 
148 BL00523D 9.89 
1.844e-11 290-302
BL00523G 9.45 5.500e-
10 513-523 BL00523F 
10.85 6.351e-09 413- 
424 


703 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 8.412e- 
12 376-390 PR00048B 
6.02 1.000e-10 334-344
PR00048B 6.02 1.474e- 
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE . 


PD00787A 14.84 8.941e-
14 66-82 


708 


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761E 14.32 8.500e-
10 822-841 


712 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2.


DM01354Y 10.69 4.977e- 
38 425-465 DM01354X 
13.86 7.300e-34 376-
415 DM01354V 12.97 
4.923e-17 311-358 
DM01354W 12.64 5.596e-
10 356-376 


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 7.545e-
27 450-496 BL00039A 
18.44 2.537e-18 147- 
186 BL00039C 15.63 
2.216e-14 280-304 
BL00039B 19.19 1.947e- 
13 194-220 


715 


BL00383


Tyrosine specific 
protein phosphatases 
proteins . 


BL00383E 10.35 4.981e- 
10 150-161 


717 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e-
39 20-68 DM00031B
15.41 2.688e-28 84-118
DM00031C 12.79 1.300e- 
12 131-142 


719 


BL00243 


Integrins beta chain 
cysteine-rich domain
proteins. 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I 
31.77 6.571e-39 607- 
650 BL00243E 16.70 
3.077e-35 274-304
BL00243G 21.38 3.625e- 
34 358-400 BL00243H 
17.53 5.235e-29 567- 
593 BL00243A 17.61 
3.250e-21 63-84 
BL00243H 17.53 7.167e- 
16 477-503 BL00243H 
17.53 2.304e-ll 524- 
550 BL00243H 17.53 
5.304e-ll 606-632 
BL00243I 31.77 1.380e- 
09 610-653 


720 


PR00217 


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e-
09 20-36 


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY
SIGNATURE 


PR00704D 11.05 5.909e- 
34 135-161 PR00704F 
13.61 7.000e-26 190-
218 PR00704E 12.55 
8.071e-26 165-189 
PR00704B 17.94 2.241e- 
23 75-98 PR00704A 
14.68 4.094e-19 30-54 
PR00704C 11.88 1.871e- 
18 99-116 


725 


PR00194


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


726 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727


PR00320 


G- PROTEIN BETA WD- 40 
REPEAT SIGNATURE 


PR00320C 13.01 2.125e-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e- 
11 323-338 PR00320B 
12.19 4.343e-10 323- 
338 PR00320B 12.19 
6.914e-10 277-292 


731
733


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e- 
16 288-307 PR00195E 
9.82 3.912e-11 457-474




PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar) . 


PF00642 11.59 9.082e- 
10 787-798 


738


BL00039


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039A 18.44 2.565e- 
28 26-65 BL00039D 
21.67 2.105e-20 338-
384 BL00039C 15.63 
9.100e-13 160-184 
BL00039B 19.19 9.617e- 
11 73-99 


739 


BL01289


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B
10.45 9.571e-17 353-
383


742 


BL01019 


ADP-ribosylation factors
family proteins. 


BL01019A 13.20 7.078e-
12 41-81 


743 


BL00965 


Phosphomannose isomerase 
type I proteins. 


BL00965C 23.78 1.000e-
40 256-305 BL00965B 
17.77 1.600e-25 126- 
153 BL00965A 10.57 
6.400e-19 94-113 


747 




Kringle domain proteins. 


BL00021D 24.56 4.563e- 
25 231-273 BL00021B 
13.33 5.345e-21 60-78


748 


BL00612


Osteonectin domain 
proteins . 


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 6.880e- 
10 135-157 


752 


BL00795


Involucrin proteins.


BL00795C 17.06 6.000e-
11 384-429 BL00795C
17.06 9.444e-11 370-
415 


754 


BL00051 


Ribosomal protein L39e


BL00051 20.92 1.935e- 
16 4-50 


755 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III. 


DM01970B 8.60 7.723e- 
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C IS 3<? <i ~h*>n*»_ 
12 99-150 


762 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 33-88 


763


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027


'Homeobox' domain 
proteins. 


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.063e-
10 309-324 BL01208B
15.83 8.031e-10 165-
180 BL01208B 15.83
4.162e-09 85-100


770 


BL00031


Nuclear hormones 
receptors DNA-binding 
region proteins. 


BL00031A 19.55 9.571e- 
32 208-241 BL00031B
22.25 5.500e-27 242- 
274 


772 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.450e- 
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27
3.032e-13 44-67 
PR00449D 10.79 8.579e- 
13 107-121 PR00449B 
14.34 3.455e-11 27-44


773 


BL00523


Sulfatases proteins. 


BL00523E 19.27 9.333e- 
23 299-329 BL00523A 
13.36 2.200e-13 47-64 
BL00523B 8.64 2.607e-
13 91-103 BL00523D
9.89 7.923e-12 224-236
BL00523C 12.64 4.512e-
10 141-152 BL00523F 
10.85 5.821e-10 373- 
384 


775 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e-
09 568-585 


776 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e- 
09 621-638 


777


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 8.412e- 
11 322-341 BL00030A 
14.39 7.000e-10 220- 
239 


779 


PR00079


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE 


PR00079B 12.98 2.929e-
26 193-222 PR00079E 
16.65 4.150e-23 348- 
375 PR00079C 8.68 
6.351e-16 246-264 
PR00079D 13.51 7.070e- 
16 264-281 PR00079A 
16.12 6.769e-13 169- 
183 


781 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221-
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00289


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785 


BL00690 


DEAH-box subfamily ATP- 
dependent helicases 
proteins . 


BL00690B 13.38 l.OOOe- 
12 147-165 BL00690A 
6.87 5.320e-10 114-124 
BL00690C 7.51 3.189e- 
09 218-228 


786 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 8.500e- 
16 50-73 PR00449A 
13.20 5.235e-14 8-30
PR00449E 13.50 2.853e- 
11 150-173 PR00449D 
10.79 1.545e-09 111-
125 


788


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 8.767e- 
10 1-21 


790 


BL00915


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e- 
39 725-764 BL00915B 
22.78 5.050e-33 633-
671 BL00915D 27.02 
1.529e-21 795-831 
BL00915A 10.09 1.000e-
13 395-407 


791 


PR00208 



GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE 


PR00208A 12.59 6.294e- 
10 120-138 PR00208A 
12.59 6.294e-10 121-
139 PR00208A 12.59
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A
12.59 6.294e-10 124- 
142 PR00208A 12.59 
6.294e-10 125-143 
PR00208A 12.59 6.294e- 
10 126-144 PR00208A 
12.59 6.294e-10 127- 
145 PR00208A 12.59 
6.294e-10 128-146 
PR00208A 12.59 6.294e- 
10 129-147 PR00208A
12.59 7.411e-09 130- 
148 PR00208A 12.59 
7.658e-09 131-149 
PR00208A 12.59 7.904e- 
09 132-150 PR00208A
12.59 8.274e-09 118-
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 4.000e-
12 196-247 BL00412D 
16.54 5.705e-11 197-
248 BL00412D 16.54 
7.848e-10 199-250 
BL00412D 16.54 1.827e- 
09 195-246 BL00412D 
16.54 1.918e-09 194- 
245 BL00412D 16.54 
2.102e-09 201-252 




797 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 6.339e- 
13 40-58 




799 


BL01052 


Calponin family repeat 
proteins . 


BL01052C 18.51 1.000e-
40 87-127 BL01052A 
16.12 1.529e-32 3-35 
BL01052B 15.31 1.257e-
25 52-78 BL01052D
10.26 5.737e-25 174-
194 




800 


BL00348


p53 tumor antigen 
proteins. 


BL00348F 23.19 3.714e-
09 197-240 




801 


BL00309


Vertebrate galactoside-
binding lectin proteins.


BL00309C 18.65 1.621e- 

U j Da-o ( 




802 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 




804 


PF00774 


Dihydropyridine 
sensitive L-type calcium 
channel (Beta subuni. 


PF00774A 16.47 8.457e- 
10 110-156 




808 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 




810 


PD02346 


PHOTOSYSTEM II PROTEIN
PRECURSOR 


PD02346F 12.89 4.340e-
09 317-354






PHOTOSYNTHESIS. 




811


BL00685 


CBF-A/NF-YB subunit 
proteins. 


BL00685B 14.41 6.779e- 
14 54-95 BL00685A 
11.22 4.798e-13 5-54 


812 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE 


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357 


Histone H2B proteins. 


BL00357 7.74 1.988e-17
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI.


PD00066 13.92 7.923e- 
15 158-171 PD00066 
13.92 5.200e-14 46-59 
PD00066 13.92 7.000e- 
14 18-31 PD00066 
13.92 7.000e-13 130-
143 PD00066 13.92 
7.500e-13 214-227 
PD00066 13.92 9.000e- 
13 102-115 PD00066 
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase 
proteins. 


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520


Interleukin-10 family
proteins. 


BL00520A 6.21 6.471e- 
09 1-14 


822 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases
family 2 proteins. 


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


828


PD02855 


FLAVOPROTEIN PROTEIN
DNA/PANTOTHEN . 


PD02855A 18.37 4.732e-
28 88-124 PD02855B 
8.36 6.478e-09 132-142 


830 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e-
13 25-45 


831 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B
11.36 1.720e-09 136-
150 PR00019B 11.36
3.880e-09 44-58 


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231 


834 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 3.898e- 
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN . 


PD02784B 26.46 8.302e-
09 73-116 


840 


PR00700


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e- 
22 369-390 PR00700D 
12.47 5.765e-21 491- 
510 PR00700C 13.17 
4.750e-14 449-467 
PR00700F 11.18 8.500e-








11 538-549 PR00700E 
17.57 3.100e-10 522- 
538 


841 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 5.404e- 
13 134-153 


844 


PD02785


PROTEIN RIBOSOMAL 60S 
L22 RNA-BINDING HEP. 


PD02785B 14.43 1.000e-
40 58-112 PD02785A 
15.23 1.915e-28 8-57 


845 


BL00826


MARCKS family proteins. 


BL00826C 7.63 6.738e- 
09 203-230 


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 4.429e-
10 15-24 


849 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-
08 340-349 


850 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e- 
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420 


Speract receptor repeat 
proteins domain 
proteins. 


BL00420B 22.67 1.000e-
40 723-778 BL00420B 
22.67 1.321e-38 933- 
988 BL00420B 22.67 
8.457e-28 482-537 
BL00420B 22.67 4.500e- 
27 587-642 BL00420B 
22.67 9.625e-27 270- 
325 BL00420B 22.67
4.205e-26 163-218 
BL00420B 22.67 5.731e- 
23 55-110 BL00420B 
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 830-885 
BL00420C 11.90 1.900e- 
13 355-366 BL00420C 
11.90 1.900e-12 808- 
819 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11.90 5.119e-11 1018-
1029 BL00420C 11.90
7.955e-10 567-578


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e- 
27 620-675 BL00420B 
22.67 9.625e-27 270- 
325 BL00420B 22.67 
4.205e-26 163-218
BL00420B 22.67 5.731e- 
23 55-110 BL00420B 
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 863-918 
BL00420C 11.90 1.900e- 
13 355-366 BL00420C 
11.90 1.900e-12 841-
852 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90 






WO 01/53312 



PCT/US00/34263 











7.955e-10 567-578 


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II 

PHOSPHODIESTERASE 
SIGNATURE 


PR00388A 10.45 2.778e- 
09 64-83 


859 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 2.925e-
13 37-56 BL00030B
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e-
10 128-147 


861 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e- 
17 23-41 PR00988C 
13.64 8.714e-16 107- 
123 PR00988F 12.23
7.828e-15 198-212 
PR00988E 8.27 9.769e- 
12 176-188 PR00988D 
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e-
10 60-72 


863 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215B 10.44 8.071e- 
12 41-54 


864 


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B 
3.52 1.837e-23 107-130 
PR00775D 8.91 4.484e- 
17 171-189 PR00775A 
9.90 8.342e-17 86-107
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267- 
286 PR00775F 12.76 
6.769e-14 249-267 


866 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688G 16.45 9.460e- 
09 89-121


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.596e- 
29 14-53 


868


BL01287 


RNA 3'-terminal
phosphate cyclase 
proteins . 


BL01287A 17.95 2.688e- 
26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 
10 304-337 


872 


BL00046


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 30-85 


874 


BL00188


Biotin-requiring enzymes
attachment site 
proteins . 


BL00188 30.29 9.036e-
32 665-711 


876 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e-
09 298-315 


877 




SUBUNIT E V-ATPASE
VACUOLAR ATP SYNTHASE 


PD02102A 16.74 4.176e-
10 97-141 






HYDROL . 




879 


BL01189 


Ribosomal protein S12e 
proteins. 


BL01189A 14.27 l.OOOe- 
40 35-71 BL01189B 
13.49 1.000e-40 71-125 


882 


BL00284


Serpins proteins. 


BL00284C 28.56 6.400e- 
17.99 6.182e-12 35-56 


889 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391


PHOSPHATIDYL INOSITOL 
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e-
15 211-231 PR00391B 
8.39 1.000e-13 83-104 
PR00391D 12.21 9.328e- 
13 191-207 PR00391A
7.83 5.390e-11 16-36


897 


PR00327


ICE NUCLEATION PROTEIN 


PR00327C 6.37 5.247e-






SIGNATURE 


09 313-328 


898 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 7.800e- 
26 386-432 BL00039A 
18.44 6.674e-16 113- 
152 BL00039B 19.19 
1.947e-13 153-179 
BL00039C 15.63 9.460e- 
11 236-260 


901 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI . 


PD00066 13.92 8.200e-
16 254-267 PD00066
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 6.200e-
16 366-379 PD00066 
13.92 8.200e-16 394- 
407 PD00066 13.92 
8.200e-14 338-351 


902 


BL01115


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 9.321e-
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.150e-
09 97-111 


904 


PR00381 


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.586e- 
25 335-356 PR00381B 
18.17 2.667e-24 204- 
224 PR00381A 9.55 
2.800e-24 107-125 
PR00381C 12.48 4.522e- 
24 226-245 PR00381D 
13.94 1.084e-22 291- 
309 PR00381F 9.13 
3.288e-22 370-392 
PR00381F 9.13 7.181e-
13 286-308 PR00381B
8.75 4.066e-11 251-272
PR00381E 8.75 7.033e- 
11 293-314 PR00381E 
8.75 8.364e-10 377-398 
PR00381D 13.94 5.230e- 
09 333-351 PR00381C 
12.48 7.120e-09 310- 
329 


906 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 525-549 


907 


PR00345


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e-
09 513-537 


908 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
144-155 


910 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.800e- 
30 48-87 


912 


BL01104 


Ribosomal protein L13e 
proteins . 


BL01104C 15.14 6.000e-
09 364-392 


922 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.842e-09 
500-511 


923 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.500e-
09 323-338 PR00320C 
13.01 5.500e-09 187- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT. 


PD02181D 12.85 8.609e- 
09 36-64 


926 


BL00019


Actinin-type actin-
binding domain proteins. 


BL00019C 14.66 7.453e- 
25 108-144 BL00019B 
13.34 6.S10C-11 61-84 
BL00019D 15.33 9.338e- 
11 205-235 BL00019A 
12.56 2.373e-10 34-45


928 


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 9.308e-11






proteins proteins. 


273-284 BL00678 9.67 
1.600e-10 314-325 
BL00678 9.67 7.600e-10 
360-371 BL00678 9.67 
8.579e-09 206-217 


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.857e-
10 137-146 


930 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 134-165 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 172-202 BL01085C 
21.81 2.038e-14 66-97 




BL01085 


Ribulose-phosphate 3-
epimerase family 
proteins. 


BL01085D 16.55 4.600e-
24 152-183 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 190-220 BL01085C 
21.81 2.038e-14 66-97 


933 




PROTEIN REPEAT MUSCLE 
CALCIUM- B I . 


PD00301A 10.24 6.400e- 
09 160-171 


936 


PF00168


C2 domain proteins. 


PF00168C 27.49 4.000e-
12 336-362 


937 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e- 
10 5-49 


940


PR00862 


PROLYL OLIGOPEPTIDASE
SERINE PROTEASE (S9A)
SIGNATURE 


PR00862D 16.17 4.086e- 
09 63-84 


945 


BL01230 


RNA methyltransferase
trmA family proteins . 


BL01230B 11.62 2.373e- 
09 407-420 


946


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 7.429e- 
18 52-68 BL00479A 
19.86 2.200e-13 26-49 


949


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE 
NAD INTERGENIC RE. 


PD01311A 30.23 5.909e-
10 66-111 


955 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


956 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


957 


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.^4 l.^ibe- 
15 111-148 


959 


BL01115


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 1.884e- 
10 31-75 


960 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 3.438e-
14 110-154 


962 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e- 
13 198-236 


963 


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
11 210-225 


966 


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e- 
09 55-70 


967 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 1.286e- 
12 104-124 DM01206B 
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e- 
10 73-93 DM01206B 
10.69 3.962e-09 108- 
128 DM01206B 10.69
5.671e-09 38-58 


969 


PF01008 


Initiation factor 2 
subunit . 


PF01008B 25.59 4.724e- 
31 417-460 PF01008C 
12.25 5.333e-18 506-
526 PF01008A 20.14 
5.875e-15 369-390 


970 


BL01277 


Ribonuclease PH
proteins . 


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975 


BL01159 


WW/rsp5/WWP domain
proteins . 


BL01159 13.85 3.605e-
12 130-145 BL01159
13.85 4.122e-10 171-
186


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791C 20.98 2.235e- 
09 55-94 


978 


BL01167


Ribosomal protein L17
proteins.


BL01167B 20.66 8.258e-
19 88-127


979 


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e- 
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e- 
36 169-199 PR00312I 
15.78 5.286e-35 332-
361 PR00312F 15.06 
5.865e-35 199-229 
PR00312H 13.31 8.313e- 
35 263-291 PR00312J 
13.73 5.688e-34 363- 
392 PR00312D 9.43 
2.636e-33 128-158 
PR00312C 15.14 8.839e- 
33 92-122 PR00312B 
15.08 8.941e-33 62-92 
PR00312G 11.11 6.657e- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59 


981 


PF00992 


Troponin . 


PF00992A 16.67 8.816e-
09 414-449 


982 


PR00299


ALPHA CRYSTALLIN
SIGNATURE 


PR00299F 13.20 2.367e- 
09 127-149 


983 


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd 
subunit proteins. 


BL01150B 17.16 1.000e-
40 156-202 BL01150A 
14.10 8.200e-39 100- 
138 


986 


BL00795 


Involucrin proteins. 


BL00795C 17.06 7.211e- 
14 4-49 BL00795C 
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e- 
10 14-59 BL00795C 
17.06 7.802e-10 2-47 
BL00795C 17.06 8.640e- 
10 19-64 BL00795C 
17.06 7.400e-09 11-56
BL00795C 17.06 7.800e- 
09 3-48 


987 


BL00939


Ribosomal protein L1e
proteins . 


BL00939F 17.27 £.393e- 
09 810-840 


988 


PR00452


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE


PR00452B 11.65 6.538e- 
11 497-513 


994 


BL00027


'Homeobox' domain


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304


ubiH/COQ6 monooxygenase
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN . 


DM01767B 10.07 7.868e-
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e-
24 73-94 PR00926D
10.53 3.250e-23 126-
145 PR00926F 17.75
6.211e-23 217-240
PR00926E 11.70 6.625e-








20 174-193 PR00926B 
16.07 2.125e-18 24-39 
PR00926A 10.41 1.000e-
15 11-25 PR00926F 
17.75 5.565e-09 120- 
143 


1005 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C 
6.75 1.000e-40 147-202 
BL00406D 12.58 3.700e- 
40 270-325 BL00406E 
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 248-298 BL00406A 
9.95 3.348e-29 11-46 


1007




TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e-
22 384-407 PR00304C
8.69 4.667e-20 98-118
PR00304B 11.60 7.577e-
19 68-87 PR00304A 
9.20 3.382e-16 46-63 
PR00304E 7.79 6.870e- 
13 418-431 


1009 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 




PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.929e-
32 68-107 


1012 




Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 6.143e- 
10 64-73 


1016 




SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION . 


PD00930B 33.72 1.391e- 
32 261-302 PD00930A 
25.62 9.550e-22 157-
183 




1022 


BL00175


Phosphoglycerate mutase
family phosphohistidine


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111




1025 


PR00305


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 




1026 


BL00353


HMG1/2 proteins.


BL00353B 11.47 2.436e-
18 238-288 BL00353C
14.83 8.844e-11 288-
335 




1028 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.310e- 
33 43-91 




1033 


PF00580


UvrD/REP helicase . 


PF00580A 13.37 4.720e- 
09 111-133 




1034 


PR00413


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY
SIGNATURE


PR00413E 15.78 3.429e-
09 154-171




1037 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.657e-
09 5-44 




1038 


PD01796


PROTEIN TRANSMEMBRANE
COBALT ZINC CADMIU.


PD01796 15.01 4.259e-
11 55-82 




1039 


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 9.036e- 
09 17-69 




1040


PR00970


ARGININE ADP-
RIBOSYLTRANSFERASE


PR00970A 17.73 6.143e-
20 56-78 PR00970D 






SIGNATURE 


9.96 2.154e-18 154-171 
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e- 
13 86-105 PR00970C 
11. OS 1.643e-ll 115- 
130 PR00970E 11.23 
9.820e-11 202-218


1042


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 6.786e-
13 114-128 PR00048A
10.52 1.000e-09 172- 
186 


1045 


BL00615 


C-type lectin domain 
proteins . 


BL00615A 16.68 1.720e- 
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases 
class-I proteins.


BL01092N 13.54 8.924e-
10 3-40 


1047 


BL01216 


ATP-citrate lyase / 
succinyl-CoA ligases 
family proteins . 


BL01216D 21.75 4.316e- 
28 314-344 BL01216A 
13.91 1.000e-10 97-112 


1048


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.618e- 
12 102-136 


1050 


BL01073 


Ribosomal protein L24e 
proteins . 


BL01073 24.30 1.000e-
40 12-62 


1054 


BL00571


Amidases proteins. 


BL00571 25.69 5.875e- 
31 160-212 


1055 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 5.235e-
11 98-117 BL00030B 
7.03 4.316e-09 137-147 


1058 


BL00223


Annexins repeat proteins 
domain proteins . 


BL00223C 24.79 8.754e- 
23 262-317 BL00223A 
15.59 9.478e-14 46-80
BL00223A 15.59 5.557e-
11 118-152 


1060 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 3.455e- 
35 158-201 


1064 


BL00455 


Putative AMP-binding 
domain proteins . 


BL00455 13.31 6.211e- 
13 280-296 


1065 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.000e- 
09 115-129 PR00019B 
11.36 3.880e-09 87-101 


1066 


PR00326 


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 4.600e- 
16 151-172 PR00326C 
9.79 1.290e-14 200-216 
PR00326B 16.74 8.548e- 
14 172-191 PR00326D 
19.09 1.257e-13 217- 
236 


1071 


PD02870


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 8.51Se- 
11 164-197 


1072 


PF00856


SET domain proteins. 


PF00856A 26.14 5.976e- 
09 350-387 


1075 


BL01009 


Extracellular proteins
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins.


BL01009D 14.19 4.300e- 
20 127-148 BL01009A 
13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C 
SERINE PROTEASE (S10) 
FAMILY SIGNATURE 


PR00724A 10.91 1.000e-
08 366-379 


1078


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.529e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 4.316e-09 






proteins proteins. 


298-309 


1081 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 7.398e- 
10 23-57 


1094 


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 3.204e-
18 57-92 BL00460B 
9.73 6.400e-13 100-118 
BL00460D 16.89 9.143e- 
12 162-182 BL00460C 
14.35 5.500e-09 133- 
156 


1095 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 67-105 PD02811B 
17.07 2.263e-21 118- 
151 PD02811C 13.25 
5.696e-13 154-167 


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111- 
144 PD02811C 13.25
5.696e-13 147-160 


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding
domain proteins. 


BL00479B 12.57 6.143e- 
09 200-216 


1105 


PF00881


Nitroreductase family. 


PF00881A 27.15 9.229e- 
13 111-147 


1109 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e- 
20 42-60 PR00405A
17.71 2.703e-17 23-43
PR00405C 19.41 6.902e-
10 63-85 


1116 


BL00355


HMG14 and HMG17
proteins.


BL00355 5.97 2.528e-25
20-51 


1117 


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.528e-25 
20-51 


1120 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e-
10 290-306 


1123 


PR00412 


EPOXIDE HYDROLASE 
SIGNATURE 


PR00412F 18.76 9.526e- 
12 301-324 


1125 


PR00186


HEMERYTHRIN SIGNATURE


PR00186A 13.62 2.800e-
09 87-101 


1129 


BL00170


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e-
33 84-129 BL00170B 
20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e- 
15 10-37 


1131 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 




clathrin adaptor 
complexes medium chain 
proteins. 


BL00990C 18.78 4.176e- 
38 235-269 BL00990A 
21.44 4.316e-36 94-132 
BL00990B 20.15 2.125e- 
27 157-187 BL00990D 
16.13 5.320e-18 403-
422 


1137 


PR00314 


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 8.000e- 
34 100-128 PR00314D 
9.66 3.531e-33 233-261 
PR00314C 16.05 8.909e- 








32 159-188 PR00314A 
14.53 1.281e-22 13-34


1139 


BL01115 


GTP-binding nuclear
protein ran proteins. 


SUVA Jm -*-\J * A* & O « «J O^l £ 

13 13-57 


1141 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 4.000e-
19 451-482 BL00107B

535 


1148 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CFT.T. WK* "" 
GLYCOPROTEIN IMMUNOGLOB . 


PD01652B 8.50 9.396e-
10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157 


PD02894 


HYDROLASE N4- PRECURSOR 


PD02894A 21.96 7.873e-
28 81-127 PD02894B 
13.93 1.188e-27 178- 
211 


1159 


BL00623 


CMC oxidoreductases 
proteins . 


BL00623E 15.00 3.531e-
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176 


1161 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-.


PD01937A 6.68 3.47Se- 
09 330-341 


1162 


PD01937 


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.47Se- 
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE 


PR00624D 11.94 7.455e- 
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


Intermediate filaments 
proteins.


BL00226B 23.86 7.384e-
09 302-350 


1177 


BL01032 


Protein phosphatase 2C 
proteins.


BL01032G 8.33 1.422e- 
10 34-48 


1178 


PR00320 


G-PROTEIN BETA WD-40


PR00320A 16.74 1.794e- 
10 205-220 PR00320C 
13.01 7.840e-10 205- 
220 PR00320B 12.19
8.457e-10 35-50
PR00320A 16.74 7.146e-
09 35-50 PR00320B
12.19 9.100e-09 79-94


1180


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e- 
19 765-784 


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-187 


1184 


BL00720


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e- 
18 1089-1113 


1185 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.553e-
13 204-229 BL00215A
15.82 2.429e-12 11-36
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983


Ly-6 / u-PAR domain 
proteins . 


BL00983C 12.69 2.761e-
10 77-93 


1188 


BL00878


Orn/DAP/Arg

decarboxylases family 2 
pyridoxal-P attachment 
si. 


BL00878B 10.95 6.000e-
16 189-204 BL00878C 
17.74 8.435e-15 225- 
245 BL00878F 19.67 
3.62Se-13 379-402 
BL00878D 16.56 1.621e-
09 270-289 


1191 


PD02939


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 


1193 


PR00345


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e-
28 72-101 PR00345E 








8.54 7.652e-28 149-174 
PR00345C 4.54 9.100e- 
28 101-125 PR00345D 
10.97 1.964e-24 125- 
149 PR00345A 13.46 
5.645e-16 43-62 


1194 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e- 
28 108-137 PR00345E 
8.54 7.652e-28 185-210 
PR00345C 4.54 9.100e- 
28 137-161 PR00345D 
10.97 1.964e-24 161- 
185 PR00345A 13.46 
5.645e-16 79-98


1195 


PF00995


Sec1 family.


PF00995B 17.37 1.120e- 
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 6.738e-
11 15-47 


1197 


BL01298 


Dihydrodipicolinate
reductase proteins. 


BL01298A 13.90 5.959e-
09 51-73 


1203 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118 


BETA-LACTAMASE CLASS A
SIGNATURE 


PR00118F 16.42 9.336e- 
09 213-229 


1206 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 a.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25
3.250e-23 51-73 
BL01183C 10.77 5.295e- 
09 246-258 


1208 


BL00979 


G-protein coupled
receptors family 3
proteins.


BL00979L 20.63 2.485e-
09 105-146 


1209 


PF00023


Ank repeat proteins. 


PF00023A 16.03 4.857e- 
11 49-65 PF00023B
14.20 1.818e-09 45-55


1212 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-
14 227-241 PR00048A
10.52 4.316e-11 199-
213 


1213 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 1.720e- 
10 20-42 PR00450C 
12.22 3.506e-09 56-78
PR00450D 16.58 6.769e-
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 5.598e- 
10 179-230 


1219 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 5.348e-
11 249-264 


1222 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e- 
15 295-308 PD00066 
13.92 7.231e-15 406- 
419 PD00066 13.92 
2.286e-12 378-391
PD00066 13.92 7.857e- 
12 434-447 PD00066 
13.92 3.348e-11 350-
363 


1223 


BL50058 


G-protein gamma subunit 
profile. 


BL50058 27.23 1.000e-
40 13-61


1226


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86 








1.000e-40 190-239
BL00437D 25.72 1.000e-
40 248-301 BL00437E
23.95 1.000e-40 327- 
379 


1230


^>±JW X X O \J 


ftllICO JL.ll X.LljilL, wHain 

repeat proteins. 


J3.UU JLJLOUJ3 1?.34 o.^7/e- 

10 S-60 


1231 




FAMILY 8 SIGNATURE 


i'ROO/JbA 11,19 6.857e- 
09 391-405 


1232 




NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e-
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e- 
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins . 


BL00866B 36.29 2.776e- 
09 75-121 


1237 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 1.818e- 
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.184e-
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168L 9.47 2.837e- 
10 31-46 PD01168L 
9.47 4.490e-10 174-189 
PD01168L 9.47 7.612e- 
10 183-198 


1249 


BL00018 


EF-hand calcium-binding 
domain proteins . 


BL00018 7.41 2.800e-10 
183-196 


1254 


BL00183 


Ubiquitin-conjugating
enzymes proteins . 


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e-
11 8-52 


1256 


BL00373 


Phosphoribosylglycinamide
formyltransferase
proteins.


BL00373C 10.35 3.348e-
12 143-156 


1258 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.217e-
10 174-193 


1259 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 8.286e- 
10 31-40 


1261 


PR00070


DIHYDROFOLATE REDUCTASE
SIGNATURE


PR00070D 11.63 1.000e-
15 112-127 PR00070C 
13.09 9.500e-15 51-63 
PR00070A 12.92 S.SOOe- 
12 16-27 


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins.


BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230- 
267 BL00462C 27.41 
2.023e-11 292-347


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 9.455e-
11 62-83 


1264 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 17-61 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY
SIGNATURE 


PR00837C 17.21 2.714e- 
18 165-182 PR00837A 
14.77 4.512e-12 86-105 
PR00837D 11.12 7.577e- 


1269 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e- 
22 40-63 PR00449E 
13.50 1.000e-16 137- 
160 PR00449D 10.79 
3.520e-ll 102-116 


1270 


BL00276 


Channel forming colicins
proteins. 


BL00276A 8.87 l.bOOe- 
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327C 15.47 9.769e-
09 220-243 


1276 


PR00412


EPOXIDE HYDROLASE 


PR00412B 12.59 7.894e- 






SIGNATURE 


12 119-135 PR00412C 
11.30 1.857e-ll 165- 
179 PR00412A 13.23 
3.400e-ll 100-119 


1277 


PF00756


Putative esterase. 


PF00756C 14.12 9.538e-
10 127-157 


1279 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1280 


BL01220 


Phosphatidylethanolamine 
-binding protein family 
proteins . 


BL01220C 14.75 9.348e-
15 248-276 




Q v n n tr n o 

BJUO Vbxo 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 2.286e- 
10 33-42 


1287 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors. 


PF00791B 28.49 7.182e- 
11 288-343 


1292 


PR00802


SERUM ALBUMIN FAMILY 
SIGNATURE 


PR00802B 16. SI 1.610e- 
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e- 
14 268-283 


1301 


BL00127 


Pancreatic ribonuclease
family proteins. 


BL00127C 31.49 3.571e-
28 82-126 BL00127B
26.57 8.800e-2a 23-68 


1302 


PR00637


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e- 
09 290-306 


1307 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e-
17 13-38 BL00215A
15.82 1.000e-16 226- 
251 BL00215A 15.82 
2.658e-13 107-132 


1308 


PR00898 


VASOPRESSIN V2 RECEPTOR
SIGNATURE 


PR00898H 11.34 4.682e- 
09 552-572 


1309 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI.


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain 
proteins.


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 
proteins.


BL00194 12.16 1.900e- 
11 15-28 


1314 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 8.969e- 
10 53-97 


1316 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e-
13 128-145


1320 


BL00783 


Ribosomal protein L13
proteins.


BL00783C 22.43 6.559e- 
24 87-117 BL00783A 
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86 


1327 


PF00514 


Armadillo/beta-catenin- 
like repeat proteins. 


PF00514A 31.30 7.268e- 
11 82-120 


1329


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-178 


1331 


PR00497 


FACTOR P40 SIGNATURE 


09 25-43 


1332 


PR00161


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.769e- 
33 10-49 


1336 


PR00700


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e- 
09 262-281 


1337 


PR00700 


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-
09 211-230


1340


PR00860 


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860A 5.46 5.034e-
13 5-18 


1341 


BL00893 


mutT domain proteins. 


BL00893 18.99 6.750e- 
16 46-71 


1343 


BL01282


BIR repeat proteins. 


BL01282B 30.49 5.974e- 
21 383-422 


1344 


DM00099


4 kw A55R REDUCTASE 
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 8.313e- 
09 417-427 


1345


BL00923 


Aspartate and glutamate 
racemases proteins. 


BL00923B 11.41 5.935e-
in lie iytc 


1348 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 7.231e- 
13 44-57 


1350 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e- 
32 416-445 PR00193C 
12.60 6.318e-31 179-
207 PR00193B 11.69 
3.571e-24 133-159 
PR00193E 19.47 9.069e- 
22 470-499 PR00193A 
15.41 1.783e-20 77-97 


1352


PR00447 


NATURAL RESISTANCE- 
ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447E 9.73 1.554e- 
15 299-319 PR00447D 
13.54 3.408e-15 200- 
224 PR00447A 12.73 
6.357e-11 97-124
PR00447G 6.69 9.877e-
10 353-373 


1353 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303A 21.77 6.667e-
26 45-82 BL00303B
26.15 1.000e-24 93-130


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.950e- 
29 375-421 BL00039A 
18.44 7.136e-29 99-138
BL00039C 15.61 4.000e-
18 225-249 BL00039B
19.19 3.182e-14 141-
167 


1357 


PF00615 


Regulator of G protein 
signalling domain 


PF00615B 16.25 2.216e- 
12 84-101 PF00615C 
10.06 5.412e-12 162-
176 


1360 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.234e- 
29 10-49 


1361 


PR00925 


PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925A 5.47 5.091e- 
18 14-29 PR00925B 
3.73 6.143e-14 29-42 
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 

'J.JO J— »03/ti _L\J / © — Of 


1362 


BL01272 


Glucokinase regulatory
protein family proteins.


BL01272B 19.61 6.870e- 

136-171 BL01272C
11.68 3.314e-25 249-
274 BL01272A 6.49

1.231e-18 99-117 


1363 


BL01272 


Glucokinase regulatory
protein family proteins. 


BL01272B 19.61 6.670e-
30 113-148 BL01272C 
11.68 3.314e-25 226- 
251 BL01272A 6.49 
1.231e-18 76-94


1364 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 167-177 


1368 


PR00169


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.592e- 
09 76-96 


1370 


PR00988 


URIDINE KINASE SIGNATURE


PR00988A 6.39 1.794e-
10 1-19


1371 


BL00242 


Integrins alpha chain 
proteins.


BL00242B 8.13 8.615e-
09 469-479 


1372 


PR00625


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625B 13.48 7.353e- 
19 46-67 PR00625A 
12.84 1.391e-16 14-34


1373 


BL00434 


HSF-type DNA-binding
domain proteins.


BL00434C 23.85 3.778e- 
09 90-130 


1374 


PR00962 


LETHAL (2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962C 8.00 6.337e-
09 505-526 


1375 


PD02475 


MUCIN EPITHELIAL TUMOR- 
ASSOCIATE.


PD02475A 23.18 8.552e-
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.571e- 
32 24-63 


1380 


BL00194


Thioredoxin family 
proteins. 


BL00194 12.16 8.333e- 
12 43-61 


1381 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 1.458e- 
15 1123-1136 


1383 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
271-282 


1385 


BL00303 


S-100/ICaBP type calcium
binding protein. 


BL00303B 26.15 6.203e-
10 95-132 


1386 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.042e-
09 1574-1628 


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
11 52-61 


1389 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 3.600e- 
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 3.512e-
31 32-71 


1392 


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e- 
10 127-137 


1393 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e-
25 88-110 PR00380D 
9.93 2.406e-20 304-326 
PR00380B 12.64 4.414e- 
16 208-226 PR00380C 
13.18 6.538e-16 243-
262 


1394


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e- 
14 462-475 PD00066 
13.92 8.800e-14 348- 
361 PD00066 13.92 
9.571e-12 405-418
PD00066 13.92 6.087e- 
11 490-503 PD00066 
13.92 8.043e-11 320-
333 


1398


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.786e-
32 10-49 


1400 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 7.038e- 

no T7n o an 


1406 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 25.62 7.324e- 
15 363-389 


1407 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.500e-
10 457-476 


1408 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 9.550e- 
11 179-193 PR00019A 
11.19 8.826e-10 228-
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e-
09 176-190


1409 


PR00510 


NEBULIN SIGNATURE 


PR00510A 9.09 4.150e-
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e-
09 251-267 


1410 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR.


PD00078B 13.14 5.696e- 
09 31-44 


1412 


BL00358 


Ribosomal protein L5 
proteins. 


BL00358B 22.76 1.000e-
40 57-103 BL00358C 
13.75 6.087e-14 122- 
136 BL00358D 14.26 
5.500e-13 143-158 
BL00358A 13.06 1.931e- 
11 33-44 


1414 


BL00282 


Kazal serine protease 
inhibitors family 
proteins.


BL00282 16.88 7.338e- 
10 511-534 


1415 


BL00023


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e- 
09 38-60 


1418 


DM00973 


3 kw RESISTANCE BENOMYL
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 1.462e- 
09 171-208 


1419 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 1.571e- 
09 428-443 


1420 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-
40 142-196 PD01941B 
15.02 7.049e-30 400- 
447 PD01941E 15.92 
2.475e-20 817-864 
PD01941C 19.96 3.118e- 
19 488-543 PD01941D 
27.18 9.614e-18 641- 
690 PD01941F 28.52 
5.382e-15 1038-1093


1422


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423 


PR00209


ALPHA/BETA GLIADIN 
FAMILY SIGNATURE 


PR00209B 4.88 6.318e-
11 1009-1028 


1424


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 8.200e- 
14 367-386 BL50002A 
14.19 9.250e-12 298-
317 BL50002A 14.19
4.462e-11 208-227
BL50002B 15.18 1.000e-
09 244-258 




PF00628


PHD-finger.


PF00628 15.84 3.045e- 
12 330-345 


142 S 




PHD-finger.


PF00628 15.84 3.045e- 
12 377-392 


1427 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e- 
16 281-299 PR00405A 
17.71 4.306e-14 262- 


1428 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.219e- 
34 147-193 


1429 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e-
12 295-314 PR00378B 
13.80 8.650e-10 166- 
186 


1431 


PR00928


GRAVES DISEASE CARRIER
PROTEIN SIGNATURE


PR00928B 13.53 3.769e-
10 103-124


1433 


BL01113 


C1q domain proteins.


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102 


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 1.000e-
12 84-103 


1438 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e- 
09 250-268 BL00290A 
20.89 4.000e-09 188-
211 


1440 


PR00806


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806 


VINCULIN SIGNATURE


PR00806B 4.28 4.960e-
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H
21.30 3.189e-31 435-
472 PD01841C 13.78
1.000e-25 185-206
PD01841M 10.82 1.250e- 
20 1175-1194 


1446 


PF00816 


H-NS histone family.


PF00816B 13.84 8.875e- 
09 190-220 


1447 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.080e-
09 402-416 


1448 


DM00315 


072 RIBONUCLEASE 
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688D 13.44 7.146e- 
09 382-405 


1455 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 2.929e- 
22 4-59 


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
09 42-53 


1460 


BL00545 


Aldose 1-epimerase 
proteins. 


BL00545C 11.28 7.353e-
17 169-182 BL00545A
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097 


ANTHRANILATE SYNTHASE
COMPONENT II SIGNATURE 


PR00097C 9.42 9.069e-
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145 


1475 


PF00686 


Starch binding domain 
proteins.


PF00686A 13.45 9.100e- 
09 267-277


1477 


trS vUSOD 


if j. KJU&Ljj.fs ^eL*J\sJ%tr dome* J.IT 

proteins . 


ftUUaooA 12.64 7.333e- 
10 466-476 


1478 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030B 7.03 9.400e- 

in /q.ci 
1U *± J O .3 


1479 


DM00406 


GLIADIN, 


DM00406 7.73 8.541e-10 
y^- j us 


1480 


BL00290 


Immunoglobulins and 
major histocompatibility
complex proteins.


BL00290B 13.17 2.385e-
15 69-87 BL00290A 
£u . oj o.Ojie-lx 12-35 


1481 


PR00150


PHOSPHOENOLPYRUVATE


PR00150F 10.45 9.039e-
09 21-51 


1482 


PF00780


Domain found in NIK1-
like kinases, mouse 
citron and yeast ROM. 


PF00780I 14.69 4.825e- 
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 1.153e-
09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.909e- 
25 17-56 


1486 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 34-50 


1488 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 9.586e- 
10 116-162 


1490 


BL00166 


Enoyl-CoA 

hydratase/isomerase 
proteins.


BL00166D 22.87 2.607e- 
24 190-226 BL00166C 
18.93 5.500e-14 140- 
167 BL00166B 16.92
9.357e-11 93-115


1491 


BL00452 


Guanylate cyclases 
proteins.


BL00452D 28.59 3.700e- 
31 63-106 BL00452E 
11.92 3.045e-13 115- 
131 


1492 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 3.667e- 
09 532-546 


1497 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 4.789e- 
24 112-155 


1503 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 


Anaphylatoxin domain 
proteins. 


BL01177E 20.64 5.800e- 
24 448-475 BL01177C 
17.35 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506 


BL00972


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins.


BL00972D 22.55 5.500e- 
14 311-336 BL00972A 
11.93 7.429e-14 48-66 
BL00972E 20.72 8.759e-
10 341-363 


1512 


BL00523


Sulfatases proteins. 


BL00523E 19.27 4.536e-
22 76-106 BL00523D
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 


1516 


BL00914


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600 


Aminotransferases class-
III pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e-
19 98-122 BL00600E
16.43 1.771e-17 302-
331 BL00600G 12.43
9.625e-17 377-396
BL00600B 19.60 5.091e-
15 160-186 BL00600C
16.18 6.040e-12 190-
206 BL00600F 8.77
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295


1523 


PD00930


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.600e-
18 41-82 


1528 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 4.774e- 
11 192-207 PR00320B 
12.19 8.839e-11 272-
287 PR00320B 12.19 
9.743e-10 106-121 
PR00320A 16.74 1.878e- 
09 192-207 PR00320A 
16.74 2.317e-09 106- 
121 PR00320A 16.74 
8.683e-09 272-287 
PR00320C 13.01 8.800e- 
09 106-121 


1538


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III. 


DM01970B 8.60 4.508e-
15 171-184 


1539 


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed).


PF00781D 11.11 7.593e- 
10 103-127 


1540


PR00965


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e-
29 312-334 PR00965E 
12.93 5.846e-29 172- 
195 PR00965F 5.98 
1.123e-28 209-231 
PR00965C 15.04 1.000e-
27 131-151 PR00965D 
5.84 1.000e-27 150-170 
PR00965G 8.52 2.440e- 
27 258-279 PR00965B 
4.80 8.650e-26 88-109 
PR00965A 12.52 1.000e-
25 35-55 PR00965I 
3.91 6.442e-25 385-406 


1541 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e-
17 163-207 


1543 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699C 24.84 1.000e-
40 599-646 PD02699A
8.91 2.286e-34 219-248
PD02699B 18.28 6.143e- 
21 485-509 


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e- 
10 182-197 PR00049D 
0.00 7.102e-09 67-82 


1547 


BL00951 


ER lumen protein 
retaining receptor 
proteins.


BL00951C 19.35 1.000e-
40 93-142 BL00951D 
13.94 8.714e-40 142- 
177 BL00951A 15.10 
1.000e-38 2-38
BJb0u951B x**.£5 6 ,z3Ue- 
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins. 


BL00536F 13.65 8.920e-
30 279-318 BL00536D
22.91 5.737e-24 21-65
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e-
09 53-73


1556 


BL00061


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e- 
13 67-105 


1557 


BL01228 


Hypothetical cof family 
proteins. 


BL01228D 17.44 8.105e-
12 107-132 


1558 


BL01228


Hypothetical cof family 
proteins.


BL01228D 17.44 8.105e- 
12 107-132 


1559 


BL01228


Hypothetical cof family 
proteins.


BL01228D 17.44 8.105e- 
12 107-132 


1562 


BL00522 


DNA polymerase family X 
proteins.


BL00522C 11.90 6.600e-
18 412-436 BL00522B 
27.30 1.738e-16 364- 
410 BL00522A 25.52 
6.000e-16 279-326 
BL00522E 19.63 6.123e-
14 502-532 BL00522F 
14.90 2.385e-13 551- 
575 


1563 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins. 


PF00651 15.00 1.947e-
11 46-59 


1564 


BL00299 


Ubiquitin domain 
proteins.


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 26.81 8.594e- 
17 184-228 BL01013C 
9.97 4.906e-12 14-24


1567 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 3.400e-10 
378-389 BL00678 9.67
5.800e-10 418-429
BL00678 9.67 8.800e-10 
295-306 


1570 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 5.235e- 
17 297-313 BL00479A 
19.86 6.625e-15 271- 
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 
12 173-189 


1576 


PR00665 


OXYTOCIN RECEPTOR 
SIGNATURE 


PR00665G 12.36 4.673e- 
24 364-384 PR00665D 
9.93 1.200e-22 138-155
PR00665F 11.73 4.000e-
22 337-354 PR00665C
5.89 1.000e-20 65-80
PR00665B 5.29 4.337e-
19 24-39 PR00665E
5.60 2.929e-15 246-260 
PR00665A 5.99 5.622e- 
15 11-25 


1577 


DM00099


4 kw A55R REDUCTASE 
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 9.308e- 
10 127-137 


1579 


BL00524 


Somatomedin B domain 
proteins. 


BL00524A 9.65 6.776e- 
14 52-73 


1580 


PD02894


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.225e-10 57-103 


1581 


BL00411 


Kinesin motor domain 
proteins.


BL00411C 15.04 5.292e- 
12 32-54 BL00411H 
15.66 4.441e-11 245-
276 


1582 


PR00604 


CLASS IA AND IB 
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e- 
09 79-87 


1584 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-
10 225-238 


1585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 9.455e- 
11 125-145 


1586 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2.


DM01354S 11.61 7.750e- 
09 474-495


1587 


PR00072 


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e-
33 180-210 PR00072A 
12.75 6.040e-25 120- 
145 PR00072C 11.42 
2.286e-24 216-239 
PR00072D 10.77 3.400e- 
22 276-295 PR00072E 
10.54 1.360e-19 301-
318 PR00072G 10.45 
5.304e-19 433-450 
PR00072F 8.87 5.935e- 
15 332-349 


1589 


BL00191 


Cytochrome b5 family, 
heme-binding domain
proteins.


BL00191H 15.64 1.537e- 
22 Sl-113 BL00191K 
17.38 9.027e-12 398- 
442


1590 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III. 


DM01970B 8.60 7.716e- 
13 211-224 DM01970B
8.60 2.157e-12 94-107


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1
CHROMOSOME.


DM00517B 10.96 6.625e- 
16 1175-1193 DM00517A 
8.21 1.000e-11 1015-
1026 


1592 


BL00037 


Myb DNA-binding domain
proteins repeat proteins
proteins.


BL00037B 15.92 3.250e-
27 116-142 BL00037A
16.68 2.500e-24 83-107
BL00037A 16.68 3.250e-
12 31-55 BL00037B
15.92 3.526e-11 64-90
BL00037C 16.86 9.654e- 
10 146-164 


1595 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.514e- 
09 110-127 


1598 


PF00628


PHD-finger.


PF00628 15.84 3.250e- 
11 1667-1682 


1599 


PR00014


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014D 12.04 5.500e- 
09 980-995 


1600


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.571e-
10 30-39 


1602 


BL00412 


Neuromodulin (GAP-43) 
proteins.


BL00412D 16.54 5.402e-
10 136-187 


1605 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins.


PF00651 15.00 3.571e- 
10 44-57 


1607 


BL00252 


Interferon alpha, beta
and delta family
proteins.


BL00252A 18.49 6.657e- 
23 20-57 BL00252B 
19.78 9.125e-16 58-109 


1610 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 1.000e-
08 61-94 


1611 


BL00904 


Protein
prenyltransferases alpha
subunit repeat proteins
proteins. 


BL00904C 8.98 7.353e-
10 91-125 BL00904D 
1.47 6.018e-09 127-168 


1612


PF00168


C2 domain proteins. 


PF00168C 27.49 3.250e-
09 365-391 


1613 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 6.051e- 
09 932-983 BL00412D 
16.54 7.153e-09 933-
984 


1614 


BL00559 


Eukaryotic molybdopterin 
oxidoreductases 
proteins.


BL00559I 13.63 3.531e- 
25 54-83 BL00559K 
13.17 2.957e-18 197- 
224 BL00559J 19.63
6.870e-16 124-176 
BL00559L 13.60 9.000e- 
16 266-284 


1615 


PD01427 


TRANSFERASE 
METHYLTRANSFERASE BI.


PD01427B 22.45 3.025e- 
22 500-541 PD01427A 
19.94 8.773e-18 439-
472


1616 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat
proteins.


BL00115Z 3.12 7.485e-
09 152-201 BL00115Z 
3.12 9.603e-09 145-154 


1617 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 7.750e- 
32 51-88 BL00303A 
21.77 1.947e-31 4-41


1618 


BL01254 


Fetuin family proteins. 


BL01254F 10.02 8.754e- 
09 137-147 


1619 


PD01888


PEPTIDE REDUCTASE
PROTEIN METHI.


PD01888B 25.10 1.000e-
40 47-97 PD01888C 
21.56 7.000e-30 125- 
155 PD01888A 12.84 
8.800e-15 7-23


1621 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e- 
09 692-704 PR00239E 
1.58 4.580e-09 697-709 
PR00239E 1.58 4.580e- 
09 702-714 PR00239E 
1.58 5.193e-09 703-715 


1622 


PR00860 


VERTEBRATE 

METALLOTHIONEIN 

SIGNATURE 


PR00860B 7.04 1.900e- 
18 27-41 PR00860C 
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e-
14 5-18 


1624 


PR00784 


MITOCHONDRIAL BROWN FAT 
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e- 
11 77-95 


1626 


BL00325 


Actin-depolymerizing 
proteins.


BL00325B 21.66 1.000e-
40 93-139 BL00325A
24.83 6.786e-23 61-93


1631 


BL00064 


L-lactate dehydrogenase
proteins. 


BL00064B 23.57 1.000e-
40 82-130 BL00064C
17.28 1.000e-40 137-
182 BL00064E 27.20
1.000e-40 223-275
BL00064F 25.14 7.882e- 
36 206-331 BL00064A 
21.16 1.000e-33 22-50 
BL00064D 14.19 6.500e- 
31 182-212 


1632 


PR00063 


RIBOSOMAL PROTEIN L27 
SIGNATURE 


PR00063B 15.24 9.700e-
11 59-84 PR00063A
11.71 1.614e-09 34-59 


1634 


PR00239


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e- 
11 36-49 PR00239C 
3.51 2.538e-09 37-45 


1636 


BL01210 


Caveolins proteins.


BL01210B 13.92 9.531e-
10 133-183 


1637 


BL00982


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 8.144e-
12 132-177 


1640 


PR00015 


GRAM-POSITIVE COCCUS 
SURFACE PROTEIN ANCHOR 


PR00015B 9.84 8.465e-
10 128-149 


1641 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 5.935e- 
11 364-379 PR00320A 
16.74 7.828e-11 364-
379 PR00320C 13.01 
2.800e-10 279-294 
PR00320C 13.01 2.800e- 
10 364-379 PR00320B
12.19 5.114e-10 279-
294 PR00320A 16.74
1.659e-09 279-294
PR00320A 16.74 2.098e-
09 229-244


1642 


PF00023 


Ank repeat proteins.


PF00023A 16.03 6.464e-
09 114-130 


1643 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.806e- 
11 74-94 


1644 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 2.200e-10
109-120 BL00678 9.67 
5.737e-09 528-539 


1645 


BL01108 


Ribosomal protein L24 
proteins.


BL01108A 20.33 7.366e-
17 56-89 


1646


PR00380


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 9.270e- 
21 103-125 PR00380D
9.93 6.308e-18 386-408
PR00380C 13.18 7.923e-
16 332-351 PR00380B
12.64 6.657e-15 292- 
310 


1647 


DM01242 


<j 1 xlrCH LoY J. ii— - 1 liviA 

LIGASE.


DM01242C 17.15 9.791e- 
37 340-381 DM01242E 
23.00 5.071e-31 463- 
505 DM01242D 23.29 
3.925e-30 420-463 
DM01242B 23.57 8.054e-
18 265-314 DM01242F 
10.61 7.618e-14 526- 
540 


1649 


PD00126 


PROTEIN REPEAT DOMAIN 


PD00126A 22.53 5.500e-
10 13-34 


1651 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.720e- 
11 431-485 


1652 


BL00933 


FGGY family of
carbohydrate kinases
proteins.


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 
13.80 9.217e-09 456- 
472 


1653 




Involucrin proteins. 


BL00795C 17.06 2.988e-
10 70-115 


1654


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e- 
17 302-334 


1655


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e-
17 282-314 


1656 


BL00741


Guanine-nucleotide
dissociation stimulators 

n-j-j^a** i,<t,H\JLj r y SIHUi 


BL00741B 14.27 1.391e-
16 607-630 


1657 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e- 
11 114-136 


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.889e- 
10 442-455 


1659 


BL00972 


Ubiquitin carboxyl-

terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 4.140e-
12 376-401 BL00972E
20.72 5.629e-09 446- 
468 


1660 


BL00406 


Actins proteins.


BL00406D 12.58 6.767e-
15 188-243 


1661


PR00105 


CYTOSINE-SPECIFIC DNA
METHYLTRANSFERASE
SIGNATURE 


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.86 
1.000e-10 1305-1319 


1662 


BL00280 


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 3.172e- 
33 3119-3163 


1663 


PR00319 


BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 5.714e-20 89-105
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85


1664


BL00018


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 6.050e-10
489-502 


1667 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.500e-
38 7-46 


1669 


BL01153 


NOL1/NOP2/sun family
proteins.


BL01153D 19.69 1.188e- 
17 -15-141 BL01153C 
13.67 8.977e-15 66-80
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678


PI3 KINASE P85 
REGULATORY SUBUNIT
SIGNATURE 


PR00678H 9.13 3.100e-
10 1146-1169 


1672 


BL00598 


Chromo domain proteins.


BL00598 14.45 8.500e- 
20 27-49 


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e-
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e- 
11 343-358 PR00049D
0.00 1.286e-10 342-357 


1676 


PR00747


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE


PR00747H 12.76 3.636e- 
19 427-448 PR00747G 
14.50 2.286e-18 368- 
393 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e-
17 42-63 PR00747D
15.23 8.759e-17 163-
183 PR00747E 15.13 
8.244e-15 254-272 
PR00747B 7.65 5.355e-
13 75-90 PR00747F 
13.56 8.714e-10 311- 
328 


1677 


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e- 
19 309-330 PR00747G 
14.50 2.286e-18 250- 
275 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e- 
17 42-63 PR00747B 
7.65 5.355e-13 75-90 
PR00747F 13.56 8.714e-
10 193-210 


1680


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67
6.684e-09 320-331 


1681 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6.32 4.188e- 
09 755-771 


1690


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 6.644e- 
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 
3.06 7.281e-10 419-434 
PR00456E 3.06 8.125e- 
10 420-435 


1692 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e-
10 487-502 PR00456E
3.06 7.281e-10 488-503
PR00456E 3.06 8.125e- 
10 489-504 


1693 


BL00674


AAA-protein family
proteins.


BL00674C 22.60 8.043e-
24 274-317 BL00674B
4.46 4.000e-23 241-263
BL00674D 23.41 8.560e- 
18 338-385 BL00674E 
15.24 1.720e-15 414- 
434 


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466B 
5.03 5.500e-11 162-186
PR00466F 9.16 6.159e-
09 498-517 


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 9.217e- 
12 283-300 BL00028 
16.07 3.769e-11 255-
272 BL00028 16.07
5.154e-11 171-188
BL00028 16.07 5.500e-
11 227-244 BL00028
16.07 1.600e-10 199-
216 


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e- 
15 62-102 BL01019B 
19.49 4.000e-15 107- 
162 


1703 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e-
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.565e- 
10 116-130 PR00019B 
11.36 4.600e-09 113- 
127 PR00019B 11.36 
7.120e-09 204-218 


1711 


BL01159 


WW/rsp5/WWP domain
proteins.


BL01159 13.85 6.523e- 
11 232-247 BL01159 
13.85 5.408e-10 613- 
628 


1712 


PF00023 


Ank repeat proteins.


PF00023A 16.03 7.000e- 
10 187-203 


1713 


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e-
11 230-241 


1715 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 7.129e- 
09 7-S1 


1718 


BL00353 


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719


BL00412


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 5.408e- 
09 432-483 


1721 


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 8.448e-
12 79-100 BL00038A
13.61 4.000e-11 52-68


1723 


PD00567 


rxxJ L&Xut KJSri~ D INL) UNA 

REPEAT HYD. 


PD00567C 9.17 8.500e- 
09 418-428


1724 


BL01279 


Protein-L-
isoaspartate (D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e- 
12 233-281 


1728 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 2.059e-11
73-86 BL00018 7.41
4.176e-11 157-170


1730 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e- 
09 17-61


1731 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e-
10 296-350 


1732 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850 


Histone deacetylase 
family . 


PF00850F 15.70 4.349e- 
22 246-279 PF00850D 
14.76 6.850e-20 177- 
201 PF00850E 8.88
8.691e-18 209-235 
PF00850G 22.75 4.098e- 
14 281-323 


1734 


BL00354 


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C 6.61 5.932e- 
09 292-307 


1735


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.263e- 
10 492-Sn:? 


1743 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
11 5-27 PR00449D 

123 PR00449E 13.50 
9.289e-10 144-167 


1744 


PR00449 


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
Xi 3-<S / PRO0449D 
10.79 2.241e-10 109- 
123 PR00449B 13.50 


1745 


BL00720


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 8.297e- 
15 136-1 RCi 


1746 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e- 
11 45-57 PR00081E 
17.54 3.935e-10 150- 
168 


1747 


BL00439 


Acyl transferases 
ChoActase / COT / CPT 
family proteins. 


BL00439H 18.24 8.43ST= 

14 BT.nniiQrt 

-t"* v j j. OXj U 17 ** J jo 

13.40 2.895e-12 3-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE


11 4-20 


1751 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 33-46 PD00066
13.92 1.000e-13 89-102
PD00066 13.92 7.000e-
13 61-74 PD00066 
13.92 6.571e-12 117- 
130 


1753 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 6.516e-
18 33-77 


1754 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e-
09 490-521 BL00790I 
20.01 2.821e-09 60-91 
BL00790I 20.01 6.357e- 
09 287-318 


1756 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.7S0e- 
35 10-49 


1758 


DM00406


GLIADIN.


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR J. 


PD02929A 28.27 4.529e-
09 224-278 


176S 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-157 


1775 


PF00023


Ank repeat proteins.


PF00023A 16.03 3.077e- 
14 523-539 


1776 


BL00942 


glpT family of
transporters proteins. 


BL00942F 15.07 4.343e-
10 371-389 BL00942B 
20.36 8.040e-09 94-137 


1777 


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.373e- 
09 279-312


1778 


BL00084 


Copper type II,
ascorbate-dependent
monooxygenases proteins.


BL00084D 25.11 3.700e- 
20 169-224 BL00084B 
24.26 8.134e-16 10-58 
BL00084C 27.71 8.412e- 
11 107-158


1779 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.831e-15 344- 
380 BL01013C 9.97
6.308e-13 435-445 
BL01013B 11.33 3.717e- 
12 409-420 


1783 


BL00741


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e-
13 492-515 



*Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
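
The RESULTS column layout described in the footnote above can be read mechanically. The following is a minimal sketch, not part of the original disclosure, of how one flattened RESULTS cell might be split into individual hits. The names SignatureHit and parse_results are hypothetical, and the pattern assumes accession subtypes of the form two letters, five digits, and an optional block letter (for example PR00456E or BL00028), as they appear in the table.

```python
import re
from typing import NamedTuple


class SignatureHit(NamedTuple):
    """One hit from a RESULTS cell (hypothetical record type)."""
    subtype: str      # accession number subtype, e.g. "PR00456E"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature
    end: int          # last residue of the signature


# Assumed layout: subtype, raw score, p-value, start-end.
# Line wrapping may split a p-value after "e-" or a range after "-",
# so optional whitespace is tolerated at those two points.
_HIT = re.compile(
    r"(?P<subtype>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>-?\d+(?:\.\d+)?)\s+"
    r"(?P<p>\d+(?:\.\d+)?e-\s*\d+)\s+"
    r"(?P<start>\d+)-\s*(?P<end>\d+)"
)


def parse_results(cell: str) -> list[SignatureHit]:
    """Split a wrapped RESULTS cell into its individual hits."""
    text = " ".join(cell.split())  # undo line wrapping
    hits = []
    for m in _HIT.finditer(text):
        hits.append(
            SignatureHit(
                subtype=m.group("subtype"),
                raw_score=float(m.group("score")),
                p_value=float(m.group("p").replace(" ", "")),
                start=int(m.group("start")),
                end=int(m.group("end")),
            )
        )
    return hits
```

Applied to the RESULTS cell of SEQ ID NO: 1691, such a routine would be expected to return three hits for subtype PR00456E, one for each of the reported sequence positions.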
TABLE 4



SEQ ID
NO:


PFAM NAME 


DESCRIPTION 


p-value


PFAM 
SCORE 


2 




Immunoglobulin domain 


2.1e-32 


109.5 


3 


pkinase 


Eukaryotic protein kinase 
domain 


1.3e-29 


110.7 


4 




Zinc finger, C2H2 type 


1.6e-21 


84.9


5 


fn3


Fibronectin type III domain 


0 


1097.1 


6 


fn3 


Fibronectin type III domain 


0 


1035.0 




fn3


Fibronectin type III domain 


0 


1090.4 


8 


fn3 


Fibronectin type III domain 


0 


1097.1 


9 


TBC 


TBC domain 


4e-40


146.7


10 


p450 


Cytochrome P450 


9.5e-17 


62.0 


12 


ank 


Ank repeat


6e-20 


79.7 


14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7


15 


zf-MYND 


MYND finger 


1.3e-06 


35.4


16


zf-MYND


MYND finger


1.3e-06


35.4 


17 


zf-C2H2 


Zinc finger, C2H2 type


1.7e-99 


343.9 


18 


CAP_GLY


CAP-Gly domain


1.2e-25


98.7 


20 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


l.Se-119 


410.5 


21 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


4.3e-102


352.6


22 


pkinase 


Eukaryotic protein kinase 
domain 


2.4e-79


277.0


23 


pkinase 


Eukaryotic protein kinase 
domain 


8.4e-74


258.6 


25 


RNA_pol_A


RNA polymerase alpha subunit


0


1077.7


26 


C1q


C1q domain


1.9e-10


44.4


27 


Ribosomal_L2
3


Ribosomal protein L23


7.8e-32


111.2 


28 


Ribosomal_L2
3


Ribosomal protein L23


1e-29


104.2


30 


zf-A20


A20-like zinc finger


1.5e-10


48.5


31 


zf-A20


A20-like zinc finger


1.5e-10


48.5 


32 


FMN_dh


FMN-dependent dehydrogenase


5.4e-179


608.1 


34 


PID 


Phosphotyrosine interaction
domain (PTB/PID)


3.8e-59


209.9 


35 




Immunoglobulin domain 


1.4e-13 


48.8


36 


ig 


Immunoglobulin domain 


1.4e-13


48.8


40 


kinesin 


Kinesin motor domain 


6.7e-76 


265.6 


44 


Ets 


Ets-domain


1.4e-56 


182.1 


45 


Ets


Ets-domain


1.4e-56


182.1 


46 


LRR 


Leucine Rich Repeat


1.7e-13 


58.3 


48 


zf-C2H2


Zinc finger, C2H2 type


2.3e-162


552.8 


49 


ITAM


Immunoreceptor tyrosine-based
activation mot 


1.4e-05 


31.9 


50 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


51 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 


ras 


Ras family 


8.5e-45 


162.3 


53 


PRK 


Phosphoribulokinase


2.1e-65 


230.7 


54 


myb_DNA- 
binding 


Myb-like DNA-binding domain 


0.096 


15.2


55 


voltage_CLC


Voltage gated chloride channels


3.3e-186 


631.9 


56 


sugar_tr 


Sugar (and other) transporter


0.00015 


-64.3 


57 


TBC 


TBC domain 


2.2e-37 


137.6 


58 


ank 


Ank repeat 


5.9e-25


96.3


59


ank 


Ank repeat 


5.9e-25 


96.3 


67 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


7.9e-49 


175.6 


68 


C2 


C2 domain 


7.9e-54


192.2 




C2 


C2 domain 


2.3e-54 


194.0 


70


Kelch 


Kelch motif 


9.4e-99 


341.5 


72 




Immunoglobulin domain 


8.2e-28 


94.7 


73 


pkinase 


Eukaryotic protein kinase
domain


8e-69


242.1



WO 01/53312



PCT/US00/34263


74 


pkinase


Eukaryotic protein kinase


2.8e-38


140.6 


76 


zf-
C4_Topoisom


Topoisomerase DNA binding C4


5.4e-54 


192.8 


83 


Peptidase_S9 


Prolyl oligopeptidase family


4.3e-10


36.8 


84 


fn3 


Fibronectin type III domain


4.1e-51


183.2


86 
88


SH2
ig


Src homology domain 2
Immunoglobulin domain


3.1e-22
0.0091


67.7
14.0


09 
92 


WD40 

laminin_G


WD domain, G-beta repeat 
Laminin G domain 


2.1e-21
6.1e-27


84.6
98.5 


93 




AMP-binding enzyme


2.4e-13


-37.2 


95 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-59 


211.4 


96


pkinase


Eukaryotic protein kinase
domain


2.6e-51


183.9


97 
98 


adh_short
kinesin


short chain dehydrogenase
Kinesin motor domain


2e-61
2.2e-86


217.5 
300.4 


101 
102 


IRS 
AAA 


PTB domain (IRS-1 type)
ATPases associated with various 
cellular act 


5.4e-36 
S. Be-05 


133.0 
-5.2 


104 
106 


pkinase 


Eukaryotic protein kinase

domain 

Ras family 


2.7e-73
8.3e-24 


256.9 
92.5 


107 
108 


FYVE 

Cyt_reductas 


FYVE zinc finger 

FAD/NAD-binding Cytochrome

reductase 


5.4e-27 
7.7e-61 


100.7 
215.5 


109 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-122 


420.0 


113 


pkinase 


Eukaryotic protein kinase 
domain 


4e-88 


306.2 


11S 


PH 


PH domain 


3.1e-11


45.2 


117 


lipocalin 


Lipocalin / cytosolic fatty- 
acid binding pr 


2.4e-14


53.5 


118 
120 


pkinase 

WD40


Eukaryotic protein kinase 
domain 

WD domain, G-beta repeat 


4.5e-20 
2.4e-14 


76.3 
61.1 


121 
123 

124 


WD40 
2 


WD domain, G-beta repeat
eIF4-gamma/eIF5/eIF2-epsilon 

Immunoglobulin domain 


2.4e-14 
1e-32


61.1
122.2


127 


mito_carr


Mitochondrial carrier proteins 


6.5e-08 
3e-16 


30.6 
58.6 


128


PP2C 


Protein phosphatase 2C 


2.2e-71


250.6 


130 


ATP1G1_PLM_M
AT8

pfkB 


ATP1G1/PLM/MAT8 family 

pfkB family carbohydrate kinase


3.1e-20
4.5e-42


80.6
137.1 


133 
134 


ACBP 


Acyl coA binding protein 
recognition motif. 


4.6e-22 
1.2e-31


85.7 
118.5 


135


IQ 


IQ calmodulin-binding motif 


2.6e-08


41.0 


136


ATP1G1_PLM_M
AT8




9.3e-22


85.7 


139 
140 


WH2 

zf-C2H2 


Wiskott Aldrich syndrome
homology region 2

Zinc finger, C2H2 type


0.0067


23.1


141 

143
146


Peptidase_S2
6

arf


Signal peptidase I 
ADP-ribosylation factor family 


1.7e-82
5.7e-10 

1.2e-39 


287.5
35.7 

145.2 


148


KRAB 
DUF6 


KRAB box 

Integral membrane protein DUF6 


7.3e-30
0.096


112.6

8.0 


149 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase 


3.8e-80 


231.1


151 


S4 


S4 domain 


1.1e-08


42.3 


153 


tRNA-synt_1d


tRNA synthetases class I (R)


3.8e-103


356.1 


154 


Cyt_reductas 
e 


FAD/NAD-binding Cytochrome
reductase 


7.8e-60 


212.2 


155 
157 


ras 

actin


Ras family
Actin


3.6e-28 
3.6e-26 


107.0
37.1


158 


Jacalin 


Jacalin-like lectin domain


0.09 


-24.9


160 


Zn_carbOpept


Zinc carboxypeptidase


5e-138


471.9 


165 


pkinase 


Eukaryotic protein kinase 
domain 


5.1e-67


236.1 


167 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-07 


27.0


168 


Ribosomal_S1
5


Ribosomal protein S15


1.1e-06


29.0


169 


DEAD 


DEAD/DEAH box helicase 


1e-48


157.0 


171 


DUF59 


Domain of unknown function 
DUF59 


0.07 


-17.4 


172 


pkinase 


Eukaryotic protein kinase 
domain 


3.7e-15


58.6


173 


globin 


Globin 


4.6e-18 


67.4 


174 


WW 


WW domain 


7.3e-06 


32.9


175 


ras 


Ras family 


1e-31


118.8


178 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


2.5e-17


71.0 


179 


zf-C2H2


Zinc finger, C2H2 type


1.5e-99 


344.2 


180 


C1q


C1q domain


8.8e-72


251.9


190 


Y_phosphatas
e


Protein-tyrosine phosphatase


4 .Se-287 


967.0 


191 


efhand 


EF hand 


7.5e-16


66.1 


193 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-82 


285.6 


194 


bromodomain 


Bromodomain 


5.8e-31


111.4 


195 


PALP 


Pyridoxal-phosphate dependent
enzyme 


2.5e-64 


227.1 


197


DnaJ 


DnaJ domain 


1.6e-38 


141.4 


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.00018 


16.9 


200 


acid_phospha 
t 


Histidine acid phosphatase 


2.Se-10 


37.2


201 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0.00048 


26.9 


204 


vATP- 
synt_AC39 


ATP synthase (C/AC39) subunit 


1.3e-159 


543.7


205


vATP-
synt_AC39


ATP synthase (C/AC39) subunit 


1.6e-139 


476.9 


206 


ldl_recept_a 


Low-density lipoprotein
receptor domain


2.4e-25


97.6 


209 


ank 


Ank repeat 


1.4e-19 


78.4 


210 


Rhomboid 


Rhomboid family 


0.0035 


1.2 


211 


C1q


C1q domain


1.6e-70 


247.7 


212 


UQ_con 


Ubiquitin-conjugating enzyme


7.4e-74 


258.8 


213 


UQ_con 


Ubiquitin-conjugating enzyme


1e-53


191.9


215 


DEAD 


DEAD/DEAH box helicase 


1.8e-43 


140.4 


216 


PMP22_Claudi
n 


PMP-22/EMP/MP20/Claudin family 


4.5e-21 


83.4


218


Glycos_trans
f_2


Glycosyl transferases 


4e-21 


83.6 


219 


ig 


Immunoglobulin domain 


0.092 


10.7 


222 


WD40 


WD domain, G-beta repeat 


7.4e-23 


89.4 


224 


TPR 


TPR Domain 


1.2e-08


42.1 


225 


DnaJ_CXXCXGX 
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


226 


DnaJ_CXXCXGX 
G 


DnaJ central domain (4 repeats)


1.5e-38 


141.5 


229 


HSP70 


Hsp70 protein 


2.4e-54


194.0 


230 


GSHPx 


Glutathione peroxidases 


3.4e-47 


170.2 


231 


tsp_1


Thrombospondin type 1 domain 


0.0075 


17.1 


233 


cyclin 


Cyclin 


4.6e-144


492.0 


234 


ras 


Ras family 


4.8e-50


179.7 


235 


LRR 


Leucine Rich Repeat 


1.2e-30


115.3


236 


LRR 


Leucine Rich Repeat 


6.7e-29


109.4


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


1.7e-09 


45.0


244 


dCMP_cyt_dea 
m 


Cytidine and deoxycytidylate 
deaminase 


2.5e-05 


31.1 


245 




Immunoglobulin domain 


6.7e-08 


30.5 


248 


wnt 


wnt family of developmental 
signaling protei 


9.1e-270


742.6


250


mito_carr


Mitochondrial carrier proteins 


1.3e-55 


193.6 


254 


adenylatekin 
ase 


Adenylate kinase 


1.8e-14 


55.7 


255 


Cation_efflu
x


Cation efflux family 


2.8e-33 


124.0 


256 


SH3 


SH3 domain 


3.9e-14


60.4


257 


Aa_trans 


Transmembrane amino acid 
transporter protein 


2.Se-S2 


187.2 


258 


adenylatekin 
ase 


Adenylate kinase 


2.1e-110 


380.2 


259 


HIT 


HIT family 


8.2e-07 


25.3 


260


Bacterial_PQ
Q 


PQQ enzyme repeat 


1.6e-15 


65.0 


262 


proteasome 


Proteasome A-type and B-type


6.5e-64


225.7 


267 


pkinase 


Eukaryotic protein kinase 
domain 


6.3e-27 


101.0 


270 


filament 


Intermediate filament proteins 


3.2e-150


512.5 


271 


Choline_kina
se 


Choline/ethanolamine kinase 


2e-67 


237.4 


277 


Ribosomal_S7


Ribosomal protein S7p/S5e


3.3e-20


80.6 


279 


pkinase 


Eukaryotic protein kinase 
domain 


3.3e-77


269.9 


280 


WD40 


WD domain, G-beta repeat 


7.8e-73


255.4 


281 


WD40 


WD domain, G-beta repeat 


7.8e-73 


255.4 


284 


zf-DHHC 


DHHC zinc finger domain 


4.6e-24


93.4


287


Exonuclease


Exonuclease 


1.4e-67 


238.0 


291 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034


11.2 


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034 


11.2 


294 


zf-C2H2


Zinc finger, C2H2 type 


1.4e-29 


111.7 


295 


zf-C2H2


Zinc finger, C2H2 type 


2.2e-125 


430.0 


296 


mito_carr


Mitochondrial carrier proteins


4.1e-59


205.5


297 


HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


302 


Glycos_trans
f_4


Glycosyl transferase 


5e-87 


302.5 


304


tRNA-synt_2 


tRNA synthetases class II (D, K 
and N) 


1.1e-84


294.8 


305 


KRAB 


KRAB box 


2e-44 


161.0 


306 


rrm 


RNA recognition motif. 


2.7e-44 


160.6


308


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


5.2e-39 


126.1 


309 


DNA_polymera
seX 


DNA polymerase X family 


2.4e-64 


227.2 


311 


F-box 


F-box domain. 


9.5e-08 


39.2 


312


ig 


Immunoglobulin domain 


6.8e-19 


65.9 


*S *1 "3 


Ets


Ets-domain


8.1e-60 


192.3 


315 


Kelch 


Kelch motif 


1.3e-106


367.6 


317 


arf 


ADP-ribosylation factor family 


3.2e-35 


130.4 


318 


sugar_tr 


Sugar (and other) transporter 


0.0003


-73.1


320 


pkinase 


Eukaryotic protein kinase 
domain 


8.1e-83 


288.6 


322 


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81 


282.6


324 


Xlink 


Extracellular link domain 


4.5e-143 


331.5 


326 


ARID 


ARID DNA binding domain 


5.1e-37 


136.4 


327 


HMG_box


HMG (high mobility group) box


6.7e-29


109.4


328 


cadherin 


Cadherin domain 


8.1e-81


281.9 


331 


chromo 


'chromo' (CHRromatin
Organization MOdifier)


4e-18


66.7


,„ 


Peptidase_M2
2 


Glycoprotease family 


1.2e-136


467.4


335 


vwa 


von Willebrand factor type A 
domain 


2.3e-07 


37.9 


339 


ras 


Ras family 


7.8e-07 


-59.1 


340 


zf-C2H2


Zinc finger, C2H2 type


8.2e-64


225.4 


342


zf-C2H2


Zinc finger, C2H2 type 


2.4e-85 


297.0 


343 


ig


Immunoglobulin domain 


0.0005 


18.0 


346 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1 


347 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1 


351 


EGF 


EGF-like domain


S.Se-20 


79.2 


352 


ank 


Ank repeat 


2.5e-101


350.0 


354 


TBC 


TBC domain 


5.1e-15


63.3


355 


PHD 


PHD-finger


3.2e-07


37.4


358


DUF6


Integral membrane protein DUF6 


0.033 


15.8 


359 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-20 


79.4 


361 


ank 


Ank repeat 


6.6e-34


126.1 


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4.7e-53 


189.7 


363 


efhand 


EF hand 


5.4e-10 


46.6 


367 


LRR


Leucine Rich Repeat


8.8e-44


158.9


368 


laminin_G


Laminin G domain 


1.5e-33 


121.7 


369 


PP2C 


Protein phosphatase 2C 


5.3e-20 


73.9


372 


LIM 


LIM domain containing proteins 


9.9e-15 


57.1


373 


KRAB 


KRAB box 


4.8e-23


90.0


376


ion_trans


Ion transport protein 


2.9e-09 


-4.2


377 


Beach 


Beige/BEACH domain


4.9e-208 


704.5 


380 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-94 


327.5 


381 


AMP-binding 


AMP-binding enzyme


1.4e-07 


-140.3 


382 


HECT 


HECT-domain (ubiquitin-
transferase).


1.3e-07


-13.5


384 


ank 


Ank repeat 


2.5e-101 


350.0


386 




Immunoglobulin domain 


9.5e-06


23.6


388


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-42 


154.6 


389 




Immunoglobulin domain 


2.8e-15 


54.3


390


mito_carr


Mitochondrial carrier proteins 


3.5e-67 


233.2


392 


TPR 


TPR Domain 


6.1e-17 


69.7 


393 


SH3 


SH3 domain 


3.5e-09 


43.9 


394 


AAA 


ATPases associated with various 
cellular act 


4.1e-21


83.6 


396 


spectrin 


Spectrin repeat 


2.1e-67 


237.3 


397 


zf-C2H2 


Zinc finger, C2H2 type 


0.0066 


23.1 


399 


fn3 


Fibronectin type III domain 


4.1e-102 


352.6 


400 


WD40 


WD domain, G-beta repeat 


0.00049 


26.8


401


E1_dehydrog


Dehydrogenase E1 component


3e-119 


409.6 


402 


fn3


Fibronectin type III domain 


0 


1719.6 


404 


LRR 


Leucine Rich Repeat 


2.1e-10


48.0


405 


cadherin 


Cadherin domain 


8.1e-81 


281.9 


406 


zf-CXXC 


CXXC zinc finger 


5e-15 


63.4 


410 


RhoGEF 


RhoGEF domain 


1.1e-23


92.1 


411 


F-box 


F-box domain. 


4.2e-06 


33.7 


412 


SNF2_N 


SNF2 and others N-terminal
domain 


S.8e-16 


61.6 


415 


CPSase_L_cha 
in 


Carbamoyl-phosphate synthase
(CPSase) 


1.5e-172 


586.6 


A1 Q 


LRR 


Leucine Rich Repeat 


3.8e-24


93.6 


419 


DENN 


DENN (AEX-3) domain


2e-58 


207.5 


420 


RasGEF 


RasGEF domain 


8.1e-43


155.7 


421 


ank 


Ank repeat 


1.4e-153 


523.7


424


G-patch


G-patch domain


1e-19


78.9 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1 


426 


Plexin_repea
t 


Plexin repeat 


0.0023 


24.6


427


Plexin_repea


Plexin repeat 


0.0023 


24.6


429 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


8.6e-11


39.2 


431 


DEAD 


DEAD/DEAH box helicase 


1e-66


214.0 


432 


SH3 


SH3 domain 


3.4e-16 


67.2


433


GTP_CDC


Cell division protein 


2.1e-114 


393.5 


436 


Collagen 


Collagen triple helix repeat 
(20 copies) 


4.6e-194 


658.1 


438 


Ricin_B_lect
in 


Similarity to lectin domain of 
ricin b 


0.0085 


10.5 


441 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


1.2e-256 


866.0 


442 


Alpha_adapti 
n_C 


Alpha adaptin carboxyl-terminal
domai 


1.8e-235 


795.7 


443 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


1.9e-65


230.9 


445 


LON 


ATP-dependent protease La (LON)
domain 


0.00012 


-17.1 


446 


ig 


Immunoglobulin domain 


0.00011 


20.1 


451


sushi


Sushi domain (SCR repeat)


1.4e-18


75.2 


452 


fn3 


Fibronectin type III domain 


1.5e-06


35.2 


454 


pyridoxal_de 
C 


Pyridoxal-dependent
decarboxylase conse


8.3e-14


50.3 


456 


kinesin 


Kinesin motor domain 


4.9e-217


734.4 


457 


neur_chan 


Neurotransmitter-gated ion- 
channel 


1e-175


597.1 


458 


Josephin 


Josephin 


0.0002 


18.7 


468 


bZIP 


bZIP transcription factor 


1.7e-07 


31.8 


470 


NTP_transfer 
ase 


Nucleotidyl transferase 


6.3e-06 


-26.3 


471 


WD40 


WD domain, G-beta repeat 


2e-28 


107.9 


473 


LIM 


LIM domain containing proteins 


0.00021 


20.7 


477 


zf-RanBP 


Zn-finger in Ran binding
protein and others. 


0.028 


21.0 


479 


WD40 


WD domain, G-beta repeat 


6.5e-18


73.0


480 


KRAB 


KRAB box 


1e-31


118.8


481


ArfGap


Putative GTP-ase activating
protein for Arf 


8.4e-66 


232.0 


485


SH2 


Src homology domain 2 


0.011 


11.4


486


C1q


C1q domain


4.3e-74 


259.6 


487 


dsrm 


Double-stranded RNA binding
motif


1.1e-47


171.9


489


zf-C2H2


Zinc finger, C2H2 type


4.8e-153 


521.9 


490 


Alpha_adapti 
n_C 


Alpha adaptin carboxyl-terminal
domai 


3.4e-222 


751.6 


492 


SKI 


Shikimate kinase


1.2e-10 


48.8 


497 


ENV_polyprot
ein


ENV polyprotein (coat
polyprotein)


2.6e-22


77.6


498 


abhydrolase_ 
2 


Phospholipase/Carboxylesterase


0.041 


-48.1 


500 


rrm 


RNA recognition motif. 


5.4e-34 


126.4


501


WW 


WW domain 


4.6e-18


73.4 


502 


ig


Immunoglobulin domain


1.1e-10


39.5 


504 


abhydrolase 


alpha/beta hydrolase fold 


0.045 


-3.6 


505 


vwa 


von Willebrand factor type A
domain


7.1e-62


219.0


508 


Na_KATPase_ 


Na+/K+ ATPase C-terminus


2.3e-145 


496.3 


509 


Exonuclease


Exonuclease


1.3e-56 


201.5 


510 


Glycos_trans
f_1


Glycosyl transferases group 1


2.9e-06


27.0 


511 


Glycos_trans
f_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


512 


Glycos_trans
f_1


Glycosyl transferases group 1


1.9e-09


38.5 


514 


pro_isomeras 
e 


Cyclophilin type peptidyl-
prolyl cis-tr


1.8e-63


221.4


515


EGF 


EGF-like domain


1.9e-18


74.7


516 


Surp 


Surp module 


4.3e-38 


140.0 


523 


ig


Immunoglobulin domain


3.3e-06


25.0


526 


UBX 


UBX domain 


1.1e-34


128.6


528 


adh_zinc 


Zinc-binding dehydrogenases


2.7e-34 


127.4 


530 


SAM 


SAM domain (Sterile alpha 
motif) 


0.046 


10.0 


531 


adh_short 


short chain dehydrogenase 


0.0025 


-34.1 


532 


mito_carr


Mitochondrial carrier proteins


2.5e-81


281.7


533 


mito_carr 


Mitochondrial carrier proteins 


2e-61 


213.5


534 


thiolase 


Thiolase 


3.5e-183 


622.0


535 


FMO-like 


Flavin-binding monooxygenase-
like


0 


1153.7 


536 


SCAN 


SCAN domain 


4e-55 


196.6 


537


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


538


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


539 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


1.9e-117 


403.6 


540 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


541


vATP-synt_E


ATP synthase (E/31 kDa) subunit


5.9e-85


295.7 


543 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-69


242.6 


544 


DUF101 


Protein of unknown function 
DUF101 


8.5e-38


139.0


545


TGFb_propept
ide


TGF-beta propeptide


1.1e-67


238.2


547 


WD40


WD domain, G-beta repeat 


2.6e-32 


120.8 


548 


RHD 


Rel homology domain (RHD).


1.6e-238


68S.2 


549 


MMR_HSR1


GTPase of unknown function


5.4e-67


236.0


551 


HECT 


HECT-domain (ubiquitin-
transferase).


4.3e-127


435.6


554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3 .5e-74 


259.8 


555 


zf-UBR1


Putative zinc finger in N- 
recognin 


3.3e-16 


67.3 


556 


Kelch 


Kelch motif 


5.5e-29 


109.7 


561 


AMP-binding


AMP-binding enzyme 


2.8e-06 


-163.7 


562


PABP


Poly-adenylate binding protein,
unique domai 


4.9e-38 


139.8 


564 


Gag_p30


Gag P30 core shell protein 


1.2e-67 


238.2 


566 


PWWP 


PWWP domain 


8.1e-16


66.0 


567 


SCAN 


SCAN domain


7.3e-68 


238.9 


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


570 


pkinase 


Eukaryotic protein kinase
domain 


1.5e-84 


294.3 


571 


CN_hydrolase 


carbon-nitrogen hydrolase 


0.00081 


-79.7 


572 


myosin_head


Myosin head (motor domain) 


0 


1495.2 


573 


myosin_head


Myosin head (motor domain) 


0 


1490.4 


575 


Surp 


Surp module 


1.7e-23 


91.5


576 


Surp 


Surp module 


1.7e-23 


91.5 


577 


DNA_pol_B


DNA polymerase family B 


0 


1138.6 


578


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


8.3e-09


42.7 




LRR 


Leucine Rich Repeat 


4.9e-21


83.3


580 


neur_chan 


Neurotransmitter-gated ion-
channel


5.9e-177


601.3 


583 


sushi 


Sushi domain (SCR repeat) 


0 


1673.0 


584 


DEAD 


DEAD/DEAH box helicase 


7.3e-36 


116.3 


586 


KH-domain


KH domain 


2.9e-13 


57.5


587 


G-patch 


G-patch domain 


2.3e-14 


61.2 


589 


LIM


LIM domain containing proteins 


2.3e-36 


133.4 


590 


bromodomain 


Bromodomain 


6.6e-32 


114.7


591 


bromodomain 


Bromodomain 


6.6e-32 


114.7


592


hormone_rec


Ligand-binding domain of
nuclear hormone


3.5e-22


87.1


593 


PHD 


PHD-finger


3.8e-12


53.8


594 


cadherin 


Cadherin domain 


4.2e-99 


342.7


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-92 


319.2


597


WD40 


WD domain, G-beta repeat 


0.00054 


26.7 


600 


FG-GAP 


FG-GAP repeat 


4.3e-75 


262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1.1e-53


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86 


300.4


605 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-42 


152.4 


606 


mito_carr 


Mitochondrial carrier proteins 


6.3e-67 


232.3


608 


PWWP 


PWWP domain 


2.6e-28


107.5 


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 


613 


CAP_GLY


CAP-Gly domain


0.0046


20.1


615 


RFX_DNA_bind 
ing 


RFX DNA-binding domain


5.2e-54


192.9


616 


kinesin 


Kinesin motor domain 


1.1e-81


284.8 


617 


kinesin 


Kinesin motor domain 


8.4e-80


278.5


618 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


0.0098


13.1


620 


MATH 


MATH domain 


7.8e-05


22.2


621 


Y_phosphatas
e


Protein-tyrosine phosphatase


1.4e-32


121.6


622 


pkinase 


Eukaryotic protein kinase 
domain 


4.4e-40


146.6


623 


BNR 


BNR repeat 


2.1e-11


51.3


624


molybdopteri
n 


Prokaryotic molybdopterin 
oxidoreductas 


1.4e-12


42.2


625 


TPR 


TPR Domain 


1.1e-17


72.2


627


cNMP_binding


Cyclic nucleotide-binding
domain


3.7e-58


206.6


630 


adh_short 


short chain dehydrogenase 


5e-17 


70.0 


631 


zf-C2H2


Zinc finger, C2H2 type


2.1e-88


307.1


632 


rrm 


RNA recognition motif.


4e-05 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1 Ca.l fid 


360.7 


636 


Fork_head 


Fork head domain 


5.9e-27


103.0


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70


246.5


642 


TPR 


TPR Domain 


4.8e-08


40.1


643


efhand


EF hand 


1.9e-27


104.6 


647 


SNF2_N


SNF2 and others N-terminal
domain 


1.2e-101 


351.1 


648


PseudoU_synt
h_2 


RNA pseudouridylate synthase 


1.9e-55 


197.6 


650 


zf-C2H2 


Zinc finger, C2H2 type 


0.0087


22.7 


651 


ank 


Ank repeat 


1.3e-17


71.9 


652 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


653 


neur_chan


Neurotransmitter-gated ion- 
channel 


4.1e-171 


581.8 


654 


tsp_1


Thrombospondin type 1 domain


4.1e-47


169.9 


659 


FH2 


Formin Homology 2 Domain 


le-107 


371.2 


661 


pou 


Pou domain - N-terminal to
homeobox domain 


5.3e-45 


162.9 


662 


C2 


C2 domain 


6.7e-19 


76.2


663 


C2 


C2 domain 


6.7e-19 


76.2 


664 


C2 


C2 domain 


6.7e-19 


76.2


667


GST


Glutathione S-transferases.


9.3e-34


114.4


668 


LRR 


Leucine Rich Repeat 


9.3e-31


115.6 


670 


spectrin 


Spectrin repeat 


4e-57 


203.2 


671 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0


672


ABC_tran


ABC transporter 


5.3e-60 


212.8


674 


WD40 


WD domain, G-beta repeat 


4.8e-24 


93.3


675


WD40


WD domain, G-beta repeat


4.8e-24


93.3 


676 


LRR


Leucine Rich Repeat 


0.0015 


25.2


679


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-29 


107.7 


680


zf-C2H2


Zinc finger, C2H2 type 


5.2e-05 


30.1 


681


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


682 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4.3e-43


156.6




zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.051


10.8 


DO / 


Synapsin 


Synapsin 


0 


1890.8 


689 

- 


PR55 


Protein phosphatase 2A
regulatory subunit PR


0 


1038.8 


691 


homeobox 


Homeobox domain 


8.5e-30 


112.4


696 


Peptidase_M2 
4 


metallopeptidase family M24 


2.6e-59


210.5


697 


RhoGEF 


RhoGEF domain 


9.5e-35 


128.9 


698 


PHD 


PHD-finger


0.008 


9.3 


701 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-123 


422.0 


702 


Sulfatase


Sulfatase 


3e-231 


781.6 


703 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-20 


79.8 


707 


Acyl_transf 


Acyl transferase domain 


1.1e-22


88.8 


708 


WD40


WD domain, G-beta repeat 


4.8e-19


76.7


710 


Ran_BP1


RanBP1 domain.


8.4e-06


-7.3 


713 


DEAD


DEAD/DEAH box helicase 


9.9e-42


134.9 


714 


PH


PH domain 


1.6e-09


39.0 


715 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.5e-37 


138.2 


717 


Sialyltransf


Sialyltransferase family


7.se-3i 


115.9 


718 




Immunoglobulin domain 


1e-29


100.8 


719 


integrin_B 


Integrins, beta chain


0 


1125.4 


720 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-08


32.4 


722 


Peptidase_C2


Calpain family cysteine 
protease 


3e-145 


495.9 


723 


ig 


Immunoglobulin domain 


2.2e-05 


22.4


724


F-box 


F-box domain. 


0.007 


23.0


725 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5 


726 


Nop 


Putative snoRNA binding domain 


8.1e-58 


205.5 


727 


WD40 


WD domain, G-beta repeat 


7.5e-26 


99.3 


730 


dsrm 


Double-stranded RNA binding
motif 


0.027 


12.1 


731 


dynamin 


Dynamin family 


4.2e-16 


66.9 


733 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type 


2.8e-10 


41.7 


735 


CDP-
OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26 


100.1 


/JO 


DEAD 


DEAD/DEAH box helicase 


8.6e-57


182.5 


739 


TSC22 


TSC-22/dip/bun family 


6.5e-32 


119.5 


742 


ras 


Ras family 


2.2e-100 


346.9 


743 


PMI_typeI 


Phosphomannose isomerase type I


1.2e-243 


822.9 


747 


trypsin 


Trypsin 


6.4e-88 


279.4 


748 


kazal 


Kazal-type serine protease
inhibitor domain 


2.2e-52 


187.4 


749 


efhand


EF hand 


6.3e-06 


33.1


751 


PHD 


PHD-finger


4.9e-16 


66.7 


752 


zf-C2H2


Zinc finger, C2H2 type


3.2e-21


83.9


753 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-11


49.8


754 


Ribosomal_L3
9


Ribosomal L39 protein


0.00018


26.7


755 


PH 


PH domain 


3.6e-14


55.7 


758 


SCAN 


SCAN domain 


1.4e-53


191.5 


759 


PA 


PA domain 


0.0065 


23.1 


760 




ADP-ribosylation factor family 


2.2e-19


77.8 


761 


CIDE-N 


CIDE-N domain 


2.2e-40 


147.6 
762


histone


Core histone H2A/H2B/H3/H4 


9.9e-53 


188.6 


763 


zf-MYND 


MYND finger 


4.1e-14 


60.3 


764 


pou 


Pou domain - N-terminal to
homeobox domain 


1e-52


188.6 


767 


vwc 


von Willebrand factor type C 
domain 


2.9e-34 


127.3 


769 


efhand


EF hand 


4.8e-11


50.1 


770 


zf-C4


Zinc finger, C4 type (two 
domains) 


2.4e-53


181.6 


772 


ras 


Ras family


7e-90 


312.0 


773 


Sulfatase 


Sulfatase 


1e-142


487.5 


775 


zf-C2H2


Zinc finger, C2H2 type 


1.1e-12


55.5 


776 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


777 


zf-C2H2 


Zinc finger, C2H2 type


1.1e-12


55.5 


778 


rrm 


RNA recognition motif. 


2.1e-32 


121.1 


779


G6PD 


Glucose-6-phosphate
dehydrogenase 


1.5e-76 


236.6 


780


spectrin 


Spectrin repeat 


3.7e-29


110.3 


781 


mito_carr


Mitochondrial carrier proteins 


4.6e-57 


198.5 


782 


SCAN 


SCAN domain 


1.3e-24 


95.2 


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


4.1e-07


37.1 


785 


DEAD 


DEAD/DEAH box helicase 


6e-06


21.7


786 


ras 


Ras family 


5.3e-39 


143.0 


787 


RNase_HII


Ribonuclease HII


2.5e-67 


237.1 


790 


PI3_PI4_kina 
se 


Phosphatidylinositol 3- and 4-
kinases


5.4e-108 


372.2 


795 


cadherin 


Cadherin domain 


2.5e-40 


147.4 


796 


ARID 


ARID DNA binding domain 


1.6e-20 


81.6 


797 


trypsin 


Trypsin 


9.9e-20 


64.8


799 


CH 


Calponin homology (CH) domain 


3.7e-15


63.8


801 


Gal-
bind_lectin


Vertebrate galactoside-binding
lectin 


4.1e-25 


88.7 


803 


WD40 


WD domain, G-beta repeat 


0.00082 


26.1


806 


TBC 


TBC domain 


1.8e-26 


101.4 


807 


TBC 


TBC domain 


1.8e-26 


101.4 


808 


CN_hydrolase


Carbon-nitrogen hydrolase


8.8e-30


278.5


811 


CBFD_NFYB_HM
F


Histone-like transcription
factor 


6e-14 


59.8 


812 


adh_short


short chain dehydrogenase 


8.1e-20 


79.3 


814 


IMP4 


Domain of unknown function 


3.3e-71 


250.0 


815 


zf-C2H2


Zinc finger, C2H2 type 


8.2e-66


232.1 


816 


Pept_tRNA_hy
dro 


Peptidyl-tRNA hydrolase 


1.6e-37 


138.0 


817 


ARID 


ARID DNA binding domain 


2.5e-18


74.3 


826 


IF5_eIF4_eIF
2


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32


121.5 


830 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1.5e-53 


191.3


831 


LRR 


Leucine Rich Repeat 


2.1e-26


101.1


832 


laminin_EGF 


Laminin EGF-like (Domains III
and V) 


2e-57 


204.2 


839 


rrm 


RNA recognition motif. 


1.3e-22 


88.5 


840 


Y_phosphatas 
e 


Protein-tyrosine phosphatase


2.6e-119


409.8


841 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-100 


346.3


844


Ribosomal_L2
2e


Ribosomal L22e protein family 


1e-64


228.4


846 


IBR 


IBR domain


9e-15 


62.5 


849


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


7.4e-07 


26.5 


850 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


0.00016 


18.9 


851 


SET 


SET domain 


5e-30 


113.2 


852 


SRCR 


Scavenger receptor cysteine-


0 


1025.4 
SEQ ID


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 






rich domain 






853 


SRCR


Scavenger receptor cysteine- 
rich domain 


0 


1025.4 


857 




Metallo-beta-lactamase


0.012 


-6.0 


858


COX6A


Cytochrome c oxidase subunit
VIa


3.4e-58 


206.7 


859


rrm 


RNA recognition motif. 


5.4e-45 


162.9


861 


PRK 


Phosphoribulokinase


5.1e-62


219.4 


863 


mito_carr 


Mitochondrial carrier proteins 


2.9e-53 


185.5 


864 


HSP90


Hsp90 protein


4.7e-158 


538.5 


866 


ig


Immunoglobulin domain 


4e-12 


44.1 


867 


zf-C2H2


Zinc finger, C2H2 type 


7e-135 


461.5 


872 


histone


Core histone H2A/H2B/H3/H4


4.9e-41


149.8 


874 


CPSase_L_cha
in


Carbamoyl-phosphate synthase
(CPSase)


2.1e-218


739.0 


879 


Ribosomal_S1
2e 


Ribosomal protein S12e 


2.1e-98 


340.3 


882 


serpin 


Serpins (serine protease 
inhibitors) 


2.5e-42 


145.7


883 


Patatin 


Patatin 


1.2e-51


182.0


884 


RA 


Ras association (RalGDS/AF-6) 
domain 


0.044 


8.0 


887 


DUF92 


Integral membrane protein DUF92 


2.7e-12


54.3


889


sugar_tr


Sugar (and other) transporter


8.2e-63


222.1


893 



DUF28 


Domain of unknown function 
DUF28 


1.3e-43 


158.3 


896 


IP_trans 



Phosphatidylinositol transfer
protein 


6.5e-98 


338.7 


898 


DEAD 


DEAD/DEAH box helicase 


1.5e-48


156.5 


899 


KE2 


KE2 family protein 


7e-61 


215.7


900 


KE2 


KE2 family protein 


4.3e-51


183.2




zf-C2H2


Zinc finger, C2H2 type 


2.7e-57 


203.8


902 


ras 


Ras family 


2.3e-75 


263.8


904 


TPR 


TPR Domain 


3.2e-22 


87.2 




GBP 


Guanylate-binding protein


8.9e-253


853.1 


907 


GBP 


Guanylate-binding protein


1.1e-239


809.6 


908 


WD40


WD domain, G-beta repeat 


2.6e-26 


100.8


909 


PH 


PH domain 


1.3e-09 


39.4 


910


zf-C2H2


Zinc finger, C2H2 type 


2.5e-39 


144.1 


913 


Epimerase 


NAD dependent 

epimerase/dehydratase family 


5e-07 


-88.5 


921


TBC 


TBC domain 


1.5e-09


30.7 


922 


WD40 


WD domain, G-beta repeat


1.6e-25 


98.2 




WD40 


WD domain, G-beta repeat 


8.2e-07 


36.1 


924 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2.9e-05


29.1 




UQ_con


Ubiquitin-conjugating enzyme


0.00033


-27.6 


926 


CH 


Calponin homology (CH) domain


3.3e-53 


190.2 


928 


WD40 


WD domain, G-beta repeat 


5.9e-48 


172.7 


929 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-10 


37.4 


930 


Ribul_P_3_ep 
im 


Ribulose-phosphate 3 epimerase
family 


7.2e-105 


361.8 




Ribul_P_3_ep
im 


Ribulose-phosphate 3 epimerase 
family 


1.2e-96 


334.4 


936 


C2 




2.2e-62


220.7


937 


NAP_family 


Nucleosome assembly protein
(NAP)


1.1e-22


84. £ 


940 


abhydrolase 


alpha/beta hydrolase fold


0.011


3.1 


944 


Tropomyosin 


Tropomyosins 


3.2e-07


25.1 


948 


pkinase 


Eukaryotic protein kinase
domain


3.4e-75


263.2


949 


WD40 


WD domain, G-beta repeat 


1.8e-27 


104.7


950


Acyltransfer
ase


Acyltransferase


1.6e-07 


38.4 






WO 01/53312 



PCT/US00/34263 





951 


SAM 


SAM domain (Sterile alpha 
motif) 


0.014


14.5 


954 


GFO_IDH_MocA


Oxidoreductase family 


1.3e-11


52.0 


955 


BTB 


BTB/POZ domain 


7e-22 


86.1


956 


BTB 


BTB/POZ domain


7e-22


86.1


957 


CDP-
OH_P_transf


CDP-alcohol
phosphatidyltransferase


0.053


-22.2 


959 


ras 


Ras family 


2.4e-97 


336.8


960 


ras 


Ras family 


8.4e-43 


155.6 


961 


Acetyltransf


Acetyltransferase (GNAT) family


1.2e-08 


42.2


962 


adh_short


short chain dehydrogenase 


2.4e-31 


117.6


963 


mutT 


Bacterial mutT protein


5.6e-06 


26.2 


969 


IF-2B 


Initiation factor 2 subunit
family


8.4e-193


653.9


970 


RNase_PH 


3' exoribonuclease family


9e-24 


92.4 


975 


WW 


WW domain 


5.7e-25


96.4


977 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


3.6e-21


83.7


978 


Ribosomal_L1
7


Ribosomal protein L17


2.4e-20


81.0


979 


LIM 


LIM domain containing proteins 


5.8e-42 


152.8 


980 


Calsequestri 
n


Calsequestrin 


1.7e-297 


1001.7 


982 


HSP20 


Hsp20/alpha crystallin family


1.2e-10


43.2


983


oxidored_q6 


NADH ubiquinone oxidoreductase, 
20 Kd. sub 


4.8e-63 


222.9


988 


TBC 


TBC domain 


2.2e-50 


180.8 




TBC 


TBC domain


A . 4, B"3U 


160.8




t- PTJR -in*- ani^ 

u K2u%_ i ii t e n w 


tRNA intron endonuclease 


u . U UJ. / 


-id" "o " 


QQfl 


homeobox


Homeobox domain


4 Q - 18 


rJ . o 


997 


pyr_redox


Pyridine nucleotide-disulphide
oxidoreducta


0.012


11.6


1000 


mito_carr


Mitochondrial carrier proteins


9.7e-123


421.2


1001 


RA 


Ras association (RalGDS/AF-6) 

domain


1.2e-15 


65.4 


1004 


DUF81


DUF81 


0.099


10.2 


1005 


actin


Actin


1.3e-174


574.3 


1006 


actin 


Actin 


3.1e-130


428.6


1007 


Cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195


661.8


1008 


TPR 


TPR Domain 


g , i e _44 


159.0


1009 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61 


216.6 


1011 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61


216.6


1012 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


4.7e-15 


53.1 


1016 


tRNA-synt_2c


tRNA synthetases class II (A) 


2.3e-15 


55.2 


1018 


RhoGAP 


RhoGAP domain 


1.6e-78 


274.3 


1022 


PGAM 


Phosphoglycerate mutase family 


3.8e-18


69.7 


1026 


HMG_box 


HMG (high mobility group) box 


8.4e-20 


79.2 


1027 


TBC 


TBC domain 


7.3e-45 


162.5 


1028 


UQ_con 


Ubiquitin-conjugating enzyme


1.4e-49 


178.1 


1032 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


0.028 


16.3 


1034 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2e-21 


84.6


1037 


KRAB 


KRAB box 


4.8e-06 


32.4 


1038 


Cation_efflu
x


Cation efflux family 


7.1e-42 


152.5 


1040 


ART 


NAD:arginine ADP-
ribosyltransferase


4.7e-47 


169.1 


1042 


WD40 


WD domain, G-beta repeat 


1.9e-18


74.7 


1043 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-24 


93.7


1045 


lectin_c


Lectin C-type domain


1.9e-28


108.0


1046 


Glucosamine_ 
iso 


Glucosamine-6-phosphate
isomerase 


0.00013 


-25.1 


1047 


ligase-CoA 


CoA-ligases 


4.5e-80


279.4 


1048


ig


Immunoglobulin domain 


1.7e-09


35.6 


1050 


Ribosomal_L2
4e


Ribosomal protein L24e 


2e-33 


124.5 


1054 


Amidase 


Amidase 


4.3e-152


518.7 


1055 


rrm


RNA recognition motif. 


3.8e-26 


100.3 


1058 


annexin 


Annexin 


6.9e-44 


159.2 


1059 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


0.023 


-23.6 


1060 


homeobox 


Homeobox domain 


3.2e-31


117.2 


1062 


Acyltransfer
ase


Acyltransferase


0.00065


10.5 


1064 


AMP-binding


AMP-binding enzyme


6.6e-100


345.3 


1065 


LRR 


Leucine Rich Repeat 


3.3e-14


60.6 


1066 


GTP1_OBG


GTP1/OBG family


4.3e-41


141.8 


1071 


ig


Immunoglobulin domain


8.4e-48


159.1 


1072 


PHD 


PHD-finger


6.8e-07 


36.3 


1074 


DENN 




8.3e-33


121.5


1075 


SCP 


SCP-like extracellular protein 


4.7e-41


149.8 


1077 


OLF 


Olfactomedin-like domain


2.2e-66


234.0


1078 


mito_carr


Mitochondrial carrier proteins


1e-42


149.3 


1079 


WD40


WD domain, G-beta repeat 


6.2e-45


1 CO 1 


10G7 


START 


START domain 


1.5e-48


174.7


1093 


DSPc


Dual specificity phosphatase,
catalytic doma


3.3e-63


223.4


1094 


GSHPx 


Glutathione peroxidases 


9 1 6e-41 


1 a a fl 


1095 


DUF25 


Domain of unknown function 
DUF25 


2e-75 


264.0 


1096 


DUF25


Domain of unknown function
DUF25


6e-75


262.4


1105 


Nitroreducta
se


Nitroreductase family


1.3e-13


58.6 


1106 


PTE 


Phosphotriesterase family 


1.3e-179


610.1 


1107 


DAGKc


Diacylglycerol kinase catalytic
domain




19.6


1109 


ras 


Ras family 


1.3e-15


40.7


1115 


ArfGap


Putative GTP-ase activating


9.7e-47


168.7


1116




HMG14 and HMG17


4.4e-21


83.5


1117 




HMG14 and HMG17 


9.9e-12


52.4


1118


J? ^ flyCLTO XaS 

e 


hydrolase fam 


2e-83 


290.6


lion 

XX^U 


*J M =1 a x> 


Eukaryotic protein kinase
domain


1.4e-94


327.6 


1123 


saVilrvrlT^fO sea 
CUJXijr UlTDXa a C 


alpha/beta hydrolase fold


9.2e-23


89.0


XXZ 27 


e 


Cyclophilin type peptidyl-
prolyl cis-tr


2.2e-56


197.1 


1111 






1.6e-30


114.9


1132 


WD40 


WD domain, G-beta repeat 


1.3e-19 


78.6


1133 


WD40 


WD domain, G-beta repeat


1.8e-15


64.9


1134


irn 


fri uomaxn 


0.0015


17.8


1136 


Adap_comp_su
b


Adaptor complexes medium
subunit family


1.2e-256


866.0


XX J l 


Adap_comp_su
b


Adaptor complexes medium
subunit family


2.5e-209


708.8


1139 


ras 


Ras family 


1.5e-86 


301.0 


1141 


pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4 


1152 


Acyltransfer
ase


Acyltransferase


1.2e-05 


29.9 


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55


196.1


1155 




Immunoglobulin domain 


1.3e-31


106.9 


1157 


Asparaginase
_2


Asparaginase 


6.4e-72 


252.3 


1159 


GMC_oxred 


GMC oxidoreductases 


4.7e-142 


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021 


27.9 


1163 


linker_histone


linker histone H1 and H5 family


3.8e-14


60.4


1164 


DED 


Death effector domain 


3.9e-05 


30.5


1165 


IRS 


PTB domain (IRS-1 type) 


2.6e-43 


157.3 


1166


IRS


PTB domain (IRS-1 type)


2.6e-43 


157.3 


X1D0 


SAM 


SAM domain (Sterile alpha 
motif)


0.04


10.5 


1170 




alpha/beta hydrolase fold 


0.098 


-7.5


1174 


SAP 


SAP domain 


3.9e-10


47.1


1177 


PP2C 


Protein phosphatase 2C 


5.3e-31


112.5


J. J. / O 


WD40


WD domain, G-beta repeat


4 .7e-35 


129.9


JL J. uu


Ets


Ets-domain


1.8e-09 


33.3


1181


Collagen


Collagen triple helix repeat 
(20 copies) 


0.00016 


24.7


1182 




TCL1/MTCP1 family 


9.5e-56


198.6 


1184 


RasGEF 


RasGEF domain 


1.7e-88 


307.4


1185


mito_carr


Mitochondrial carrier proteins


1.5e-62


217.3


1187 


UPAR_LY6


u-PAR/Ly-6 domain 


0.0042 


15.6


1188 


Orn_DAP_Arg_


if y JL X UUAOX VAC JjJUS 1 * L 


6.2e-128


430.6


1193


Stathmin 


Stathmin family 


1.8e-90 


314.0 


1194 


Stathmin


Stathmin family


1.8e-90


314.0


1195 


Sec1


Sec1 family


3.2e-183


622.1


1196 


pyr_redox


Pyridine nucleotide-disulphide
oxidoreducta


3.1e-32


in . a 


1197 


Glyco_transf
_8


Glycosyl transferase family 8


1.2e-09


45.5


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022


-16.8


1203 


adh_short


short chain dehydrogenase


8.3e-45


162.3


1206 


Ubie_methylt
ran


ubiE/COQ5 methyltransferase


1.3e-121


417.4


1208 


7tm_3


/ u iTci it. s nic irLLJi due [cucpuui. 


7.2e-09


29.0


1209 


ank


Ank repeat


3.9e-15


63.7


1210 


vATP- 
syn L. .ruo p 


>\ JL tr SyllLllaSc &uuun J. l 


2.5e-128


439.7


1212 


zf-C2H2 


Zinc finger, C2H2 type


5.5e-17


69.9 


1213 


efhand


EF hand


3.2e-07


37.4 


1219 


rrm


RNA recognition motif.


2.1e-40


147.7


1220 


DUF6 


Integral membrane protein DUF6 


O.OU 


21.5 


1222 


SCAN 


SCAN domain 


1.5e-71


251.1


1223 


G-gamma


GGL domain 


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0


1158.9


1232 


PX 


PX domain


2.2e-15


64.5




PX


PX domain


2.2e-15


64.5




FCH 


Fes/CIP4 homology domain


3.3e-09


44.0


1241 


Peptidase_M2
0


Peptidase family M20/M25/M40


2e-63 


224.1 


1243 


WW 


WW domain 


0.044 


17.9 


1247 


UPF0006


Metalloenzyme of unknown
function UPF0006 


6.3e-61 


215.8 


1248 


Glycos_trans
f_2 


Glycosyl transferases 


4.5e-10


46.9 


1249 




EF hand 


4e-11


50.4 


1254


UQ_con


Ubiquitin-conjugating enzyme


2.1e-73


257.3 


1255


ras


Ras family


2.2e-62 


220.7 


1256 


formyl_trans
f


Formyl transferase


4.9e-30 


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger) 


5.3e-13 


46.4


1261 


DiHfolate_re
d 


Dihydrofolate reductase 


2.1e-69 


241.7


1262 


G_glu_transp
ept


Gamma-glutamyltranspeptidase


1.8e-110 


380.4 


1263 


PAS 


PAS domain 


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat


4.2e-22 


86.9 


1266 


SCP 




6e-29 


108.0


1267 


K_tetra 


K+ channel tetramerisation
domain


2.8e-27 


104.0 




ras 


Ras family


1.3e-85 


297.9 






Zinc finger, C3HC4 type (RING 
finger) 


4.2e-10


37.0 


1276 


abhydrolase 


alpha/beta hydrolase fold 


5.4e-23 


89.8 


1277 


abhydrolase


alpha/beta hydrolase fold


5.6e-21


83.1


1279 


trypsin 




4.4e-41


132.0 




PBP 


Phosphatidylethanolamine-
binding protein


1.3e-13


58.7


1285 




Zinc finger, C3HC4 type (RING
finger)


5.6e-14


49.6 


1287 


ank




1.7e-52


187.8 


1294 


fn3


Fibronectin type III domain


0.026


20.9 


1295 


GBP 


Guanylate-binding protein


0.00026 


-70.0 


1296 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family


6.9e-41


149.3


1297 


Rhodanese 


Rhodanese-like domain


3.2e-14 


60.7 


1298 


LIM


LIM domain containing proteins 


d . oe-zi, 


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43 


145.2 


1307 


mito_carr


Mitochondrial carrier proteins


2.1e-53


186.0 


1308 


WD40 


WD domain, G-beta repeat 


1.6e-17


I X . o 


1310 


UPAR_LY6


u-PAR/Ly-6 domain


7.1e-20


•7C c 
/ D . O 


1313 


thiored 


Thioredoxin 


3.6e-05


21.6 


1314 


Aa_trans 


Transmembrane amino acid 
transporter protein 


1.5e-67


237.9


1315 


trypsin 


Trypsin 


4.4e-41




1320 


Ribosomal_L1
3 


Ribosomal protein L13 


3.9e-62 


219.8 


1327 


Armadillo_se 
g


Armadillo/beta-catenin-like 
repeats 


0.0054 


23.4 


1328 


KRAB 


KRAB box 


0.052


-5.6 


1329 


rrm 


RNA recognition motif. 


2.1e-40


147.7


1330 


Bcl-2


Apoptosis regulator proteins,
Bcl-2 family


0.014 


-1.6 


1331 


PX 


PX domain 


2.1e-10


48.0


1333 


KRAB 


KRAB box 


i" t 8e-36 


134.6


1334 


UPP_syntheta 
se 


Putative undecaprenyl 
diphosphate synt


2.3e-89


310.3


1335 


UPP_syntheta
se


Putative undecaprenyl


1.8e-59


211.0


1336 


DSPc


Dual specificity phosphatase,
catalytic doma


1.2e-31 


118.6 


1337 


DSPc


Dual specificity phosphatase,
catalytic doma


2.3e-12


54.5


1338


TPR 


TPR Domain


0.00021


28.1


1340 


metalthio


Metallothionein 


0.013 


20.3 


1341 


mutT




5.8e-09


36.5 


1343 


Band_41


FERM domain (Band 4.1 family)


1.3e-38 


122.5 


1344 


Kelch 


Kelch motif


1.4e-44


161.5


1345 


Antifreeze 


Antifreeze protein 


1.2e-10


48.8


1347 


3Beta_HSD


3-beta hydroxysteroid
dehydrogenase/isomera


0.086


-177.2


1348 


BTB 


BTB/POZ domain


5.3e-28


106.5


1349 


DUF6 


Integral membrane protein DUF6 


0.033


15.8 


1350 


myosin_head 


Myosin head (motor domain)


0 


1088.7 


1352 


Nramp 


Natural resistance-associated 
macrophage pro 


1.2e-202 


686.6 


1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase 


3.6e-65 


209.0 


1356 


C2 


C2 domain 


2.4e-15 


64.4


1357 


RBD 


Raf-like Ras-binding domain 


4.2e-57 


203.1


1360 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-141 


481.4 


1361 


HMG14_17


HMG14 and HMG17 


7.9e-40 


145.7


1362 


SIS 


SIS domain 


3.8e-30


113.5


1363 


SIS 


SIS domain 


1.3e-28


108.5


1364 


ig 


Immunoglobulin domain 


0.00026


19.0


1368 


K_tetra


K+ channel tetramerisation
domain


1.1e-16


68.9


1371 


Collagen 


Collagen triple helix repeat 
(20 copies) 


2.2e-113 


390.1 


1372 


DnaJ 


DnaJ domain 


6.6e-36


132.7


1376 


KRAB 


KRAB box 


2.1e-38


141.0 


1378 


ELM2


ELM2 domain


2e-23


91.3


1366 


thiored 


Thioredoxin 


1.2e-23


82.8


1381 


ank 


Ank repeat 


2.3e-83


290.4


1382 


BTB 


BTB/POZ domain 


3e-11


50.8


1383


WD40


WD domain, G-beta repeat


1.6e-19


78.3


1384 


WD40 


WD domain, G-beta repeat 


6.3e-24


92.3 


1387 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-09


35.6 


1389 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-50 


179.5


1390 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-85


296.9


1393 


kinesin


Kinesin motor domain


7.8e-188


637.4


1394 


zf-C2H2 


Zinc finger, C2H2 type 


1.2e-49 


178.4 


1398 


KRAB 


KRAB box 


5.1e-22


8^.6 


1402 


bZIP 


bZIP transcription factor 


0.035


13.1


1405 


sugar_tr


Sugar (and other) transporter


0.003


-101.5


1406 


RhoGAP 


RhoGAP domain


8.9e-47


168.8 


1407 






1e-35


132.1


1408




Leucine Rich Repeat




DO . U 


1409 


at ~~ 




6e-54 


1 QO C 
17£ . © 


1410 


ank


Ank repeat


1.6e-17


71.6


1412 


Ribosomal^liS 


ribosomal L5P family C-terminus


8.2e-58


205.5 


1415 


trypsin 


Trypsin 


4.7e-85


270.4


1416 


aminotran_1


Aminotransferases class-I


4 . 4e— OS 


-91.2 


1417 


S1


S1 RNA binding domain


1 . Se-C 7 


33.1 


1419


WD40


WD domain, G-beta repeat


2.2e-09


44.6 


1422 




Cadherin domain 


8.3e-42


152.3


1424 


SH3 


SH3 domain


£, . 3C-OU 


*> a a "* 
zdu. j 


1425 


PHD 


PHD-finger


3.2e-17


70.6 


1426 


PHD 


PHD-finger


3.2e-17


70.6




ArfGap


Putative GTP-ase activating
protein for Arf


1e-37


138.8


1428 


helicase_C


Helicases conserved C-terminal
domain


1e-26


102.2


1429 


WD40


WD domain, G-beta repeat


3.9e-07


37.2 


1430 


Inositol_P


Inositol monophosphatase family 


2.5e-10


40.2 


1431 


mito_carr


Mitochondrial carrier proteins


4.3e-83


287.7 


1433 


C1q


C1q domain


2.9e-16


66.2 


1434 


WD40 


WD domain, G-beta repeat 


1.6e-13 


58.3 


1435 


Inos-1-
P_synth


Myo-inositol-1-phosphate
synthase 


7e-228 


770.4 


1436 


rrm 


RNA recognition motif.


1.4e-34 


128.3 


1438 


ig 


Immunoglobulin domain 


1.3e-12 


45.6 


1440 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67 


236.7 


1441 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1443 


Kelch 


Kelch motif 


0.00013 


28.7 


1446 


ARID 


ARID DNA binding domain 


1.8e-21 


84.7 


1447 


zf-C2H2 


Zinc finger, C2H2 type 


9.4e-28 


105.6 


1448 


AMP-binding 


AMP-binding enzyme 


2.6e-07 


-14*. 1 


1451 


rrm 


RNA recognition motif. 


6.5e-21


82.9 


1454 


ig 


Immunoglobulin domain 


5.6e-44 


146.7


1455 


Sialyltransf


Sialyltransferase family


5.4e-21


83.2


1460 


Aldose_epim 


Aldose 1-epimerase


1.9e-35


131.2 


1461 


C2 


C2 domain 


4e-18 


73.6


1470 


TIG 


IPT/TIG domain 


3.1e-19 


77.3 


1472 


PseudoU_synt
h_2


RNA pseudouridylate synthase


4.3e-16


66.9








1474 


DENN


DENN (AEX-3) domain


1.3e-44


161.6


1475 


Cation_efflu
x


Cation efflux family


4.6e-49


176.4 


1477 


TBC 


TBC domain 


8e-47 


169.0


1478 


rrm 


RNA recognition motif. 


2fe-21 


84.6


1480 


ig 


Immunoglobulin domain 


5.5e-06 


24.3 


1484 


pha 


ICXUUUSJLC W^HUXil^ ^JlUUClli ax^/uci 

subuni 


0.028


-225.9


1485






1.8e-68


240.9


1486 


pkinase 


Eukaryotic protein kinase 
domain 


9.5e-13


49.9 


1488 


helicase_C


Helicases conserved C-terminal
domain


1.4e-15




1489 


DUF89 


Protein of unknown function 
DUF89 


0.079


-132.4


1490 


ECH 


Enoyl-CoA hydratase/isomerase
family


5.2e-41


149.7


1491 


guanylate_cy 
c 


Adenylate and Guanylate cyclase 
catalyt 


5.9e-46


166.1


1492 


LRR 


Leucine Rich Repeat 


3.4e-19


77.2 


1495 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


7.1e-10


36.3 


1497 


pkinase 


Eukaryotic protein kinase 
domain 


1e-22


85.8


1500 


SH3 


SH3 domain 


9.3e-05


27.2 


1502 


homeobox


Homeobox domain 


0.084 


13.8 


1503 


homeobox 


Homeobox domain 


0.084


13.8


1505 


EGF 


EGF-like domain


2.7e-23


90.8


1506 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


2.7e-21


84.2 


1508 


Peptidase_M2
0


Peptidase family M20/M25/M40


2.8e-28


101.8


1511


PX


PX domain


1.9e-11


51.5 


1512 


Sulfatase 


Sulfatase


2.8e-35


130.7 


1516 


Syntaxin 


Syntaxin 


0.011


-62.3 




aminotran_3


Aminotransferases class-III


9.7e-106


305.6


1520 




Immunoglobulin domain


0.075


11.0


1521 


RA 


Ras association (RalGDS/AF-6) 


0.013 


13.3 


1523 


RhoGAP 


RhoGAP domain 


2.5e-05 


18.7 


1528 


WD40


WD domain, G-beta repeat


5.4e-24


93.1


4.3J O 


IMS 


impB/mucB/samB family


7.8e-95


328.5


1538 


FYVE 


FYVE zinc finger


3.2e-27


101.5


1539 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


6e-07 


36.5 


1540 




kJcu._Lcii cixo mx sin type x prutexn 


0


1184 . 7 




SAP 




6e-06 


33.2 


1654 


Amino_oxidase


Flavin containing amine oxidase 


3.2e-43 


157.0 




Amino_oxidas
e


Flavin containing amine oxidase


3.2e-43


157.0


1656 


RhoGEF 


RhoGEF domain 


1.4e-24 


95.1 


103 / 




GTPase of unknown function 


0.0011


-45.5 


1659 


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


2.5e-11


51.1 


1660 


actin


Actin 


6.6e-21 


69.9 


1661 


BAH 


BAH domain 


1.7e-82 


287.5 


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4 


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9


1667 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-93 


324.4


1669 


Nol1_Nop2_Su
n


NOL1/NOP2/sun family


1.3e-23


84.3


1671 


SH2 


Src homology domain 2 


5.4e-15 


46.9 


1672


chromo


'chromo' (CHRromatin
Organization Modifier)


2.1e-18


67.7


1674 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type 


0.0025 


17.6 


1676 


Glyco_hydro_
47


Glycosyl hydrolase family 47


1.8e-187 


636.2 


1677 


Glyco_hydro_
47


Glycosyl hydrolase family 47


4.5e-74


259.5 


1680 




WD domain, G-beta repeat


1.1e-27


105.5 


1681 


WD40 


WD domain, G-beta repeat


1.1e-27


105.5 


1683 


MMR_HSR1


GTPase of unknown function 


1.8e-78


274.1


1691




RNA recognition motif. 


1.8e-37


137.9 


1692




RNA recognition motif.


1.8e-37


137.9


1693


AAA 


ATPases associated with various 

cellular act


1.3e-81


284.5


1697 


Ferric_reduct


Ferric reductase like 
transmembrane com 


8.4e-82


285.2 






Ferric reductase like 
transmembrane com 


3.5e-53 


190.1 


1699 


zf-C2H2


Zinc finger, C2H2 type 


4.4e-34 


126.6 




arf 


ADP-ribosylation factor family 


9e-19 


75.8 


1702 


GTP EFTU 


Elongation factor Tu family 


0.014 


11.4 


1703 




SCAN domain 


1.8e-54


194.4 


1707 


pkinase 


Eukaryotic protein kinase
domain


1.2e-88


307.9 


i in q 




WD domain, G-beta repeat


0.0035


24.0 


1710 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3


1711 


WW 


WW domain


7.6e-12


52.8


1712 


ank 


Ank repeat 


4.2e-34 


126.7 


1713 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-09


38.3 


1714 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-09


38.3 


1715 


ras 


Ras family 


4.4e-41


149.9


1718 


HMG_box 


HMG (high mobility group) box 


8.3e-21


82.6


1719 


TBC 


TBC domain 


1.1e-45


165.2 


1721 


HLH


Helix-loop-helix DNA-binding
domain


9.2e-10


45.9 


1723 


dsrm


Double-stranded RNA binding
motif 


2.9e-05 


30.9 


1724 


RrnaAD 


Ribosomal RNA adenine 


0.045 


9.2 


1725 






5.9e-40 


146.2 


172S 


HAT 


HAT (Half-A-TPR) repeats


2.9e-44 


160.5 


1728


efhand


EF hand


5.1e-20 


79.9 




1 


Histone deacetylase family 


1.7e-104 


360.6 


1735 


LRR 


Leucine Rich Repeat 


4.6e-34 


126.6


1739 


PI-PLC-X


Phosphatidylinositol-specific
phospholipase


0.0023


16.1


1743 


ras 


Ras family 


3.7e-10


-21.3 


1744 


ras 


Ras family


3.7e-10


-21.3 


1745 


RasGEF 


RasGEF domain 


3.2e-49 


176.9 


1746 


adh_short 


short chain dehydrogenase 


7.1e-08 


34.6 


1751 


zf-C2H2 


Zinc finger, C2H2 type 


9e-39


142.2 


1754 


fn3


Fibronectin type III domain 


5.5e-101


348.9 


1756 


zf-C2H2 


Zinc finger, C2H2 type 


6.3e-93


322.1 


1758 


rrm 


RNA recognition motif.


0.017 


21.2 


1760 


Nop 


Putative snoRNA binding domain 


6.1e-95


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8


1765 


MMR_HSR1


GTPase of unknown function 


6.4e-41 


149.4 


1769


CN_hydrolase


Carbon-nitrogen hydrolase


3e-06 


-43.9 


1775 


ank 


Ank repeat 


4.1e-07 


37.1 


1779 


Oxysterol_BP


Oxysterol-binding protein


4.7e-56 


199.6 


1783 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6


1784 


RhoGEF


RhoGEF domain. 


1.6e-23 


91.6


SEQ ID NO:


PFAM NAME


DESCRIPTION


p-value


PFAM SCORE


1785 


rrm


RNA recognition motif. 


6.4e-14 


59.7



TABLE 5



SEQ ID NO:


POSITION OF SIGNAL IN AMINO ACID SEQUENCE


MaxS (MAXIMUM SCORE)


MeanS (MEAN SCORE)






1 


1-21 


0.991 


0.955 


2


1-31 


0.995 


0.944 


3 


1-33 


0.949


0.736 


4 


1-19 


0.970 


0.951 


5


1-26 


0.971


0.863 


6 


1-26 


0.971 


0.863 


7 


1-26 


0.971 


0.863 


8 


1-26 


0.971


0.863


9 


1-46 


0.982 


0.901 


10 


1-21 


0.991


0.955 


11 


1-23 


0.989 


0.899 


12 


1-25 


0.955 


0.803 


13 


1-18 


0.932 


0.625 


14 


1-18 


0.938 


0.876 


15 


1-25 


0.941 


0.811 


16 


1-17 


0.972 


0.939 


17 


1-27 


0.964


0.777 


18 


1-16 


0.914 


0.657 


19 


1-19 


0.953 


0.840 


20 


1-20 


0.935


0.701 


21 


1-22 


0.974


0.850 


22 


1-33 


0.961


0.895 


23 


1-19 


0.991


0.959 


24 


1-31 


0.995 


0.944 


25 


1-22 


0.976 


0.935 


26


1-27 


0.996 


0.928 


27 


1-24 


0.953 


0.739 


28 


1-21 


0.906 


0.688 


29 


1-31 


0.986 


0.841 


30 


1-28 


0.980 


0.893 


31 


1-19 


0.993 


0.976 


32 


1-22 


0.998 


0.909 


35 


1-33 


0.949 


0.736 


36 


1-33 


0.949 


0.736 


46 


1-19 


0.970


0.951 


67 


1-25 


0.968 


0.848 


71 


1-18 


0.949 


0.845 


72 


1-30 


0.991


0.919 


75 


1-29 


0.958


0.854 


88 


1-20 


0.985


0.945


94 


1-33 


0.934


0.943 


97 


1-46 


O.S64 


0.595 


103 


1-49 


0.983 


0.570


108 


1-26 


0.978 


0.885 


111 


1-23 


0.989 


0.899 


126 


1-25 


0.955


0.803 


129 


1-19 


0.963 


0.918 


138 


1-29 


0.971 


0.844 


143 


1-18 


0.914 


0.628 


148 


1-20 


0.969 


0.904 


156 


1-25 


0.941 


0.811 


158 


1-22 


0.979 


0.927 


160 


1-17 


0.972 


0.939 


161 


1-48 


0.903 


0.571 


162 


1-25 


0.937 


0.729 


168 


1-16 


0.939 


0.826


171 


1-27 


0.964 


0.777 


178 


1-21 


0.945 


0.825 


180 


1-27 


0.981 


0.941


187 


1-28 


0.982 


0.936 


190 


1-19 


0.953 


0.840 


196 


1-22 


0.975 


0.916 


197 


1-22 


0.963 


0.936


SEQ ID NO:


POSITION OF SIGNAL IN AMINO ACID SEQUENCE


MaxS (MAXIMUM SCORE)


MeanS (MEAN SCORE)






199 


1-20 


0.935


0.701


200 


1-23 


0.977 


0.773


206 


1-30 


0.984


0.890


207 


1-19 


0.990


0.924 


208 


1-22 


0.974 


0.850 


210 


1-40 


0.940 


0.670 


211 


1-28 


0.971 


0.849


216 


1-24 


0.986 


0.956 


218 


1-33 


0.961 


0.895 


219 


1-19 


0.970 


0.871 


221 


1-19 


0.904 


0.553 


222 


1-21 


0.917 


0.555 


230 


1-19 


0.991 


0.959


231 


1-26 


0.953 


0.800


232 


1-25 


0.988 


0.826


239 


1-23 


0.969 


0.828 


240 


1-17 


0.982


0.955 


241 


1-17 


0.982


0.955


245 


1-30


0.970


0.722


248 


1-22 


0.976


0.935




1-23 


0.968


0.940




1-18 


0.971


0.923


.GO -L 


1-24 


0.883


0.587


265 


i - ■ ft 

X _ 0 


0.939


0.868


272 


1-24 


0.953


0.739


283 


1-21


0.906


0.688




1 - "> Q 
X - *S J» 


0.997


0.854


290 


1-31 


0.986


0.841


302 


1-28 


0.980


0.893


304 


1-16 


0.907


0.635


312 


1-19 


0.993


0.976 


313 


1-17 


0.930


0.753 


323 


1-22 


0.998


0.909 


324 


1-17 


0.982


0.954 


328 


1-19 


0.971


0.865 


329 


1-22 


0.963 


0.924 


330 


1-33 


0.978 


0.841 


331 


1-24 


0.920 


0.712 


332 


1-24 


0.975 


0.881 


333


1-19


0.984


0.941 


334 


1-20 


0.899 


0.567 


335 


1-27 


0.942


0.813 


336 


1-20 


0.952


0.850 


337 


1-38 


0.942


0.653 


338 


1-27 


0.973


0.772 


339 


1-36 


0.979 


0.804 


340 


1-27 


0.888 


0.597 


343 


1-19 


0.971 


0.865


344 


1-22 


0.994


0.928 


345 


1-17 


0.966 


0.687 


346 


1-19 


0.936 


0.822 


347 


1-22 


0.963


0.924 


349 


1-24 


0.982


0.966 


351 


1-21 


0.918


0.815 


352 


1-31 


0.988


0.912


354 


1-31 


0.974


0.839


355


1-29 


0.932 


0.632 


356 


1-15 


0.994 


0.969 


357 


1-33 


0.935 


0.726 


360 


1-27 


0.938 


0.827 


361


1-25 


0.954 


0.674 


362 


1-22 


0.929


0.788 


363 


1-21 


0.881 


0.715 


364 


1-33 


0.978


0.841


365 


1-33 


0.978


0.841


SEQ ID NO:


POSITION OF SIGNAL IN AMINO ACID SEQUENCE


MaxS (MAXIMUM SCORE)


MeanS (MEAN SCORE)






366 


1-21 


0.916 


0.820 


367 


1-19 


0.936


0.822


368 


1-29 


0.972 


0.874 


370 


1-24 


0.920 


0.712 


371 


1-24 


0.961 


0.773 


372 


1-27 


0.919 


0.768 


373 


1-19 


0.986 


0.945 


375 


1-32 


0.994 


0.932 


376 


1-34 


0.987 


0.810 


377


1-17 


0.995 


0.950 


378 


1-49 


0.971 


0.749 


380 


1-20 


0.966 


0.874 


381 


1-20 


0.928 


0.782


382 


1-19 


0.986 


0.934 


383 


1-28 


0.965 


0.829 


384 


1-39 


0.970 


0.551 


386 


1-24 


0.975 


0.881 


388


1-30 


0.989 


0.868 


389 


1-19 


0.984 


0.941 


390 


1-26 


0.971 


0.782 


392 


1-20 


0.981 


0.900 


393 


1-16 


0.968 


0.890 


394 


1-23 


0.937 


0.701 


397 


1-22 


0.985 


0.854 


399 


1-46 


0.977 


0.698 


401 


1-20 


0.899 


0.567 


402 


1-22 


0.967 


0.931 


403 


1-27 


0.992 


0.934 


404 


1-19 


0.991 


0.973 


405 


1-23 


0.994 


0.921 


407 


1-35 


0.987 


0.658 


408 


1-39 


0.976 


0.551 


409 


1-33 


0.897


0.570 


410 


1-25 


0.990 


0.962


411 


1-38 


0.977


0.827


412 


1-20 


0.944 


0.768 


413 


1-20 


0.988


0.965


414 


1-46 


0.993


0.638


415 


1-23 


0.981


0.940


417 


1-29 


0.941


0.672


418 


1-20 


0.952 


0.850 


419 


1-19 


0.986 


0.967 


420 


1-29 


0.965 


0.861 


421 


1-22 


0.889


0.785 


422 


1-48 


0.982 


0.862 


424 


1-19 


0.979


0.933 


428 


1-38 


0.942


0.653 


430 


1-18 


0.947 


0.595 


432 


1-33 


0.957 


0.789 


433 


1-26 


0.979 


0.904 


434 


1-27 


0.962 


0.777 


435 


1-24 


0.998 


0.977 


436 


1-27 


0.973 


0.772 


443 


1-15 


0.966 


0.940


448 


1-36 


0.979 


0.804 


453 


1-41 


0.958 


0.609 


455


1-33 


0.943 


0.606 


457 


1-27 


0.888 


0.597 


462 


1-16 


0.925 


0.681 


486 


1-27 


0.972 


0.845 


495 


1-24 


0.917 


0.636 


498 


1-26 


0.993 


0.890 


505 


1-20 


0.976 


0.926 


507 


1-17 


0.966 


0.687 


510 


1-23 


0.930 


0.593


SEQ ID NO:


POSITION OF SIGNAL IN AMINO ACID SEQUENCE


MaxS (MAXIMUM SCORE)


MeanS (MEAN SCORE)






511 


1-23 


0.930


0.593 


512


1-23 


0.930 


0.593 


515


1-18 


0.978 


0.956 


523


1-19 


0.936 


0.822 


529 


1-22 


0.963 


0.924 


545 


1-24 


0.982 


0.966 


550


1-30 


0.933 


0.713 


552 


1-21 


0.973 


0.912 


554 


1-23 


0.969 


0.784 


571 


1-21 


0.918


0.815


574 


1-31 


0.968 


0.912 


580 


1-39 


0.525 


0.556 


594 


1-31 


0.974 


0.839 


608 


1-29 


0.932 


0.632 


609 


1-29 


0.932


0.632 


610 


1-21 


0.990 


0.948 


621 


1-15 


0.994


0.969 


623 


1-33 


0.935 


0.726 


653 


1-27 


0.938 


0.827 


668 


1-22 


0.929 


0.788 


677 


1-16 


0.948 


0.807


685 


1-21 


0.881


0.715 


699 


1-22 


0.975 


0.816


702 


1-31 


0.968


0.398


707 


1-16 


0.860


0.562


713 


1-25 


0.966


0.743


718 


1-19 


0.936


0.822


719 


1-20 


0.961


0.824


729 


1-29 


0.972


0.874


735 


1-46 


0.903


0.598 


746 


1-14 


0.916 


0.730 


747


1-22 


0.965 


0.876


748 


1-29 


0.968 


0.785


759 


1-24 


0.961 


0.773 


767 


1-27 


0.919 


0.768 


768 


1-33 


0.900 


<r:s85 


773 


1-42 


0.959 


0.702 


779 


1-19 


0.986 


0.945 


797 


1-19 


0.944 


0.759 


798 


1-19 


0.900


0.S68 


820 


1-17 


0.995 


0.950


827 


1-49 


0.971


0.749 


848 


1-20 


0.968 


0.874


864 


1-20 


0.928 


0.782 


866 


1-19 


0.986 


0.934 


873 


1-23 


0.948 


0.886 


881 


1-28 


0.965 


0.829


887 


1-39 


0.970 


0.551 


927 


1-30 


0.989 


0.868 


934 


1-48 


0.988 


0.777


939


1-39 


0.994 


0.889 


944 


1-26 


0.971 


0.782


950 


1-29 


0.957 


0.845


963 


1-20 


0.981 


0.900 


964 


1-20 


0.886 


0.558 


973 


1-16 


0.968


0.890


980 


1-34 


0.961


0.749 


981 


1-20 


0.953 


0.822 


984


1-12 


0.938 


0.780


1015 


1-22 


0.985 


0.854 


1040 


1-46 


0.977 


0.698


1052 


1-18 


0.969 


0.842 


1059 


1-20 


0.927 


0.867


1065 


1-33 


0.983 


0.918 


1069 


1-22 


0.993 


0.935


SEQ ID NO:


POSITION OF SIGNAL IN AMINO ACID SEQUENCE


MaxS (MAXIMUM SCORE)


MeanS (MEAN SCORE)


1075 


1-27 


0.992 


0.934


1080 


1-19 


0.931


0.829 


1092 


1-19 


0.991 


0.973 


1094 


1-4S 


0.992 


0.653 


1095 


1-30 


0.974 


0.929 


1105 


1-23 


0.994 


0.921


1123 


1-35 


0.987 


0.658 


1138 


1-32 


0.954 


0.613 


1140 


1-33 


0.989 


0.789 


1142 


1-33 


0.897 


0.570


1152 


1-25 


0.990 


0.962 


1170 


1-38 


0.977 


0.827 


1176 


1-20 


0.944 


0.768 


1187 


1-20 


0.988 


0.965 


1189 


1-35 


0.967 


0.839 


1192 


1-46 


0.993 


0.638 


1193 


1-16 


0.925 


0.710 


1197 


1-29 


0.985 


0.853 


1208 


1-23 


0.981


0.940 


1225 


1-29 


0.941 


0.672 


1245 


1-19 


0.986


0.967 


1258 


1-29 


0.965 


0.861 


1265 


1-22 


0.889 


0.785 


1266 


1-20 


0.944


0.809 


1276 


1-48 


0.982 


0.862


1292 


1-19 


0.979


0.933 


1296


1-21 


0.984


0.944 


1297 


1-19 


0.984 


0.953 


1332 


1-38 


0.942 


0.653 


1358 


1-18 


0.947


0.595 


1371 


1-33 


0.957 


0.789 


1380 


1-26 


0.979


0.904 


1397 


1-27 


0.962


0.777 


1399 


1-23 


0.997 


0.960 


1404


1-24 


0.998 


0.977 


1410 


1-15 


0.946 


0.845 


1414 


1-24 


0.913 


0.588


1415 


1-19 


0.982 


0.929


1416 


1-12 


0.931 


0.891 


1418 


1-30 


0.933


0.563


1420 


1-20 


0.881


0.561 


1421 


1-19 


0.990 


0.968 


1423 


1-17 


0.968 


0.863 


1424 


1-21 


0.885 


0.591 


1425 


1-24 


0.913 


0.588 


1426 


1-24 


0.913 


0.588 


1428 


1-25 


0.967


0.899 


1430 


1-34 


0.977 


0.819 


1431 


1-28 


0.979 


0.923 


1432 


1-36 


0.957


0.613 


1433 


1-32 


0.921 


0.753


1434


1-39 


0.983 


0.621


1435 


1-25 


0.910 


0.631 


1436 


1-42 


0.988 


0.868


1437 


1-22 


0.998


0.980


1442 


1-20 


0.918 


0.753


1448 


1-12 


0.931 


0.891


1462 


1-18 


0.968 


0.888


1490 


1-20 


0.881 


0.561 


1518 


1-17 


0.968 


0.863


1525 


1-21 


0.885 


0.591 


1547 


1-28 


0.974 


0.891 


1561 


1-25 


0.967 


0.899 


1580


1-17 


0.923 


0.824


1593 


1-28 


0.979 


0.923


SEQ ID NO:


POSITION OF SIGNAL IN AMINO ACID SEQUENCE


MaxS (MAXIMUM SCORE)


MeanS (MEAN SCORE)


1596 


1-16 


0.929


0.709 


1601 


1-36 


0.957 


0.613 


1606 


1-22 


0.979 


0.831 


1607 


1-20 


0.974 


0.770 


1608 


1-32 


0.921 


0.753


1614 


1-33 


0.969 


0.829


1616


1-20 


0.959 


0.869 


1625 


1-39 


0.983 


0.621 


1632 


1-25 


0.910


0.631


1636 


1-33 


0.897 


0.591 


1639 


1-42 


0.988 


0.868 


1645 


1-20 


0.927 


0.568 


1647 


1-17 


0.923 


0.742 


1648 


1-22 


0.998 


0.980 



TABLE 6



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725




1 


1787


3573 


5359 


784CIP2_1 


1103 


2 


1788 


3574 


5360 


784CIP2_2 


2673 


3 


1789 


3575


5361 


784CIP2_3 


4117 


4 


1790 


3576 


5362 


784CIP2_4 


5556 


5


1791


3577


5363


784CIP2_5 


5562 


6 


1792 


3578


5364 


784CIP2_6 


5562 


7 


1793 


3579 


5365 


784CIP2_7 


5562 


8 


1794 


3580 


5366 


784CIP2_8


5562 


9 


1795 


3581 


5367 


784CIP2_9 


5563 


10 


1796 


3582 


5368 


784CIP2_10 


5564 


11 


1797 


3583 


5369 


784CIP2_11 


5565 


12 


1798 


3584 


5370 


784CIP2_12 


5689 


13 


1799 


3585 


5371 


784CIP2_13 


5729 


14 


1800 


3586 


5372 


784CIP2_14 


5745 


15 


1801 


3587 


5373 


784CIP2_15


5777 


16 


1802 


3588 


5374 


784CIP2_16


5777 


17 


1803 


3589 


5375 


784CIP2_17 


5789 


18 


1804 


3590 


5376 


784CIP2_18 


5792 


19 


1805 


3591 


5377 


784CIP2_19


5804 


20 


1806 


3592 


5378 


784CIP2_20 


5805 


21 


1807 


3593 


5379 


784CIP2_21 


5805 


22 


1808 


3594 


5380 


784CIP2_22


5844 


23 


1809 


3595 


5381 


784CIP2_23


5844 


24 


1810 


3596 


5382 


784CIP2_24 


5850 


25 


1811 


3597 


5383 


784CIP2_25


5867 


26 


1812 


3598 


5384 


784CIP2_26


5973 


27 


1813 


3599 


5385


784CIP2_27 


5995 


28 


1814 


3600 


5386 


784CIP2_28


5995 


29 


1815 


3601 


5387 


784CIP2_29


6005 


30 


1816


3602


5388


784CIP2_30 


6007 


31 


1817


3603


5389


784CIP2_31


6007 


32 


1818 


3604 


5390 


784CIP2_32 


6009 


33 


1819 


3605


5391


784CIP2_33 


6012 


34 


1820 


3606 


5392 


784CIP2_34


6015 


35 


1821 


3607 


5393


784CIP2_35 


6016 


36 


1822 


3608 


5394 


784CIP2_36 


6016 


37 


1823 


3609 


5395 


784CIP2_37


6018 


38 


1824 


3610 


5396 


784CIP2_38


6018 


39 


1825 


3611 


5397 


784CIP2_39 


6018 


40 


1826 


3612 


5398 


784CIP2_40 


6023 


41 


1827 


3613 


5399 


784CIP2_41


6070 


42 


1828 


3614 


5400 


784CIP2_42


6081 


43 


1829 


3615 


5401 


784CIP2_43


6089 


44 


1830 


3616 


5402 


784CIP2_44


6118 


45 


1831 


3617 


5403 


784CIP2_45


6118 


46 


1832 


3618 


5404


784CIP2_46


cTTrt - 

O 1.-3 u 


47


1833


3619


5405


784CIP2_47


6177 


48 


1834 


3620 


5406 


784CIP2_48


6189 


49 


1835 


3621 


5407 


784CIP2_49


6191 


50 


1836 


3622 


5408 


784CIP2_50 


6204 


51 


1837 


3623 


5409 


784CIP2_51


6204 


52 


1838 


3624 


5410 


784CIP2_52 


6284 


53 


1839 


3625 


5411 


784CIP2_53 


6367 


54 


1840 


3626 


5412 


784CIP2_54


6436 


55 


1841 


3627 


5413 


784CIP2_55 


6442 


56 


1842 


3628 


5414 


784CIP2_56


6445 


57 


1843 


3629 


5415 


784CIP2_57 


6457 


58 


1844 


3630 


5416 


784CIP2_58 


6458 


59 


1845 


3631 


5417 


784CIP2_59


6458


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725




60 


1846 


3632 


5418 


784CIP2_60 


6462 


61 


1847 


3633 


5419 


784CIP2_61 


6472 


62 


1848 


3634 


5420 


784CIP2_62 


6499 


63 


1849 


3635 


5421 


784CIP2_63


6499 


64 


1850 


3636 


5422 


784CIP2_64 


6505 


65 


1851 


3637 


5423


784CIP2_65


6534 


66 


1852 


3638 


5424 


784CIP2_66 


6534 


67 


1853 


3639 


5425


784CIP2_67 


6540 


68 


1854 


3640 


5426 


784CIP2_68 


6550 


69 


1855 


3641 


5427 


784CIP2_69 


6550 


70 


1856 


3642 


5428 


784CIP2_70


6592 


71 


1857 


3643 


5429 


784CIP2_71 


6645 


72 


1858


3644


5430


784CIP2_72


6671 


73 


1859


3645 


5431 


784CIP2_73 


6763 


74 


1860 


3646 


5432 


784CIP2_74


6763 


75 


1861


3647


5433


784CIP2_75


6786 


76 


1862 


3648 


5434 


784CIP2_76 


6824 


77 


1863 


3649 


5435 


784CIP2_77


6830 


78 


1864 


3650 


5436


784CIP2_78 


6831 


79 


1865 


3651 


5437


784CIP2_79


6832 


80 


1866 


3652 


5438 


784CIP2_80


6834 


81 


1867 


3653 


5439


784CIP2_81


6834 


82 


1868 


3654 


5440 


784CIP2_82


6835 


83 


1869


3655 


5441 


784CIP2_83 


6837 


84 


1870 


3656


5442 


784CIP2_84 


6843 


85 


1871 


3657 


5443 


784CIP2_85


6859 


86 


1872 


3658 


5444 


784CIP2_86


6915 


87 


1873 


3659 


5445 


784CIP2_87 


6932 


88


1874


3660


5446


784CIP2_88 


6957 


89 


1875 


3661 


5447 


784CIP2_89 


6961 


90 


1876 


3662 


5448


784CIP2_90 


6973 


91 


1877 


3663 


5449 


784CIP2_91


6973 


92 


1878 


3664 


5450 


784CIP2_93


7007 


93 


1879 


3665 


5451 


784CIP2_94


7018 


94 


1880 


3666 


5452 


784CIP2_95 


7019 


95 


1881 


3667 


5453 


784CIP2_96 


7020 


96 


1882 


3668 


5454 


784CIP2_97 


7020 


97 


1883 


3669 


5455 


784CIP2_98


7021 


98 


1884 


3670 


5456 


784CIP2_99


7023 


99 


1885 


3671 


5457 


784CIP2_100 


7027 


100 


1886


3672


5458


784CIP2_101 


7028 


101 


1887 


3673 


5459 


784CIP2_102


7029 


102 


1888 


3674 


5460 


784CIP2_103


7031 


103 


1889 


3675 


5461 


784CIP2_104 


7032 


104 


1890 


3676 


5462 


784CIP2_105 


7033 


105 


1891 


3677 


5463 


784CIP2_106 


7035 


106 


1892 


3678


5464 


784CIP2_107 


7036 


107 


1893 


3679 


5465 


784CIP2_108 


7039 


108 


1894 


3680 


5466 


784CIP2_109 


7043 


109 


1895 


3681 


5467 


784CIP2_110 


7044 


110 


1896


3682 


5468 


784CIP2_111 


7046 


111 


1897 


3683


5469


784CIP2_112


7054 


112 


1898 


3684 


5470 


784CIP2_113 


7061 


113 


1899 


3685 


5471 


784CIP2_114 


7077 


114 


1900 


3686


5472


784CIP2_115


7092 


115 


1901


3687


5473


784CIP2_116


7094 


116 


1902 


3688 


5474 


784CIP2_117 


7106 


117 


1903 


3689 


5475 


784CIP2_118


7107 


118 


1904 


3690 


5476 


784CIP2_119 


7111 


119 


1905 


3691 


5477 


784CIP2_120 


7123 


120 


1906 


3692 


5478 


784CIP2_121 


7142 


121 


1907 


3693 


5479 


784CIP2_122


7142 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725




122 


1908 


3694 


5480 


784CIP2_123


7154 


123 


1909 


3695 


5481


784CIP2_124


7160 


124 


1910 


3696 


5482 


784CIP2_125 


7169 


125 


1911 


3697 


5483 


784CIP2_126 


7185 


126 


1912 


3698 


5484 


784CIP2_127


7197 


127 


1913 


3699 


5485 


784CIP2_128


7219 


128 


1914 


3700


5486


784CIP2_129


7226 


129 


1915 


3701 


5487 


784CIP2_130 


7229 


130 


1916 


3702 


5488 


784CIP2_131


7234 


131 


1917 


3703 


5489 


784CIP2_132 


7235 


132 


1918 


3704 


5490 


784CIP2_133 


7235 


133 


1919 


3705 


5491 


784CIP2_134 


7238 


134 


1920 


3706 


5492 


784CIP2_135


7247 


135


1921 


3707 


5493 


784CIP2_136 


7261 


136 


1922 


3708 


5494 


784CIP2_137


7262 


137 


1923 


3709 


5495 


784CIP2_138


7267 


138 


1924 


3710 


5496


784CIP2_139


7272 


139 


1925 


3711 


5497


784CIP2_140


7273 


140 


1926 


3712 


5498


784CIP2_141 


7282 


141 


1927 


3713 


5499 


784CIP2_142 


7288 


142 


1928


3714


5500


784CIP2_143


7291 


143 


1929 


3715 


5501 


784CIP2_144


7293 


144 


1930 


3716 


5502


784CIP2_145


7294 


145


1931 


3717 


5503 


784CIP2_146 


7299 


146 


1932 


3718 


5504 


784CIP2_147


7300 


147 


1933 


3719 


5505 


784CIP2_148 


7312 


148 


1934 


3720 


5506 


784CIP2_149


7313 


149


1935 


3721 


5507 


784CIP2_150 


7315 


150 


1936 


3722 


5508


784CIP2_151 


7318 


151 


1937 


3723 


5509 


784CIP2_152


7321 


152 


1938 


3724 


5510 


784CIP2_153


7330 


153 


1939 


3725 


5511 


784CIP2_154 


7331 


154 


1940 


3726 


5512 


784CIP2_155


7333 


155


1941 


3727 


5513 


784CIP2_156 


7350


156


1942


3728


5514


784CIP2_157


7352 


157 


1943 


3729 


5515 


784CIP2_158 


7384 


158 


1944 


3730 


5516 


784CIP2_159 


7403 


159 


1945 


3731 


5517 


784CIP2_160 


7431 


160 


1946 


3732 


5518 


784CIP2_161


7441 


161


1947 


3733 


5519 


784CIP2_162 


7453 


162 


1948 


3734 


5520 


784CIP2_163 


7467 


163 


1949 


3735 


5521 


784CIP2_164 


7471 


164 


1950 


3736 


5522 


784CIP2_165


7493 


165 


1951 


3737 


5523 


784CIP2_166


7502 


166 


1952 


3738


5524 


784CIP2_167 


7511 


167 


1953 


3739 


5525 


784CIP2_168


7514 


168 


1954 


3740 


5526 


784CIP2_169


7520 


169 


1955 


3741 


5527 


784CIP2_170


7541 


170 


1956 


3742


5528 


784CIP2_171 


7570 


171 


1957 


3743 


5529 


784CIP2_172 


7578 


172 


1958 


3744


5530


784CIP2_173


7583 


173 


1959 


3745 


5531 


784CIP2_174


7592 


174 


1960 


3746 


5532 


784CIP2_175


7601 


175 


1961 


3747 


5533 


784CIP2_176


7602 


176 


1962 


3748


5534 


784CIP2_177 


7608 


177 


1963 


3749 


5535 


784CIP2_178 


7615 


178 


1964 


3750 


5536 


784CIP2_179 


7617 


179 


1965 


3751


5537


784CIP2_181


7624 


180 


1966 


3752 


5538 


784CIP2_182


7626 


181 


1967 


3753 


5539 


784CIP2_183


7640 


182 


1968 


3754 


5540 


784CIP2_184


7641 


183


1969 


3755 


5541 


784CIP2_185


7641


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725




184 


1970 


3756 


5542 


784CIP2_186 


7641 


185 


1971 


3757 


5543 


784CIP2_187


7642 


186 


1972 


3758 


5544 


784CIP2_188 


7649 


187


1973


3759


5545


784CIP2_189


7656 


188 


1974 


3760 


5546 


784CIP2_190 


7657 


189 


1975 


3761 


5547 


784CIP2_191


7657 


190 


1976 


3762 


5548 


784CIP2_192


7662 


191 


1977 


3763 


5549 


784CIP2_193 


7668


192 


1978 


3764 


5550 


784CIP2_194 


7673 


193 


1979 


3765 


5551 


784CIP2_195


7690 


194 


1980 


3766 


5552 


784CIP2_196


7700 


195 


1981 


3767 


5553 


784CIP2_197


7709 


196 


1982 


3768 


5554 


784CIP2_198


7736 


197 


1983 


3769 


5555 


784CIP2_199 


7737 


198 


1984 


3770 


5556 


784CIP2_200


7744 


199 


1985 


3771 


5557 


784CIP2_201 


7771 


200 


1986 


3772 


5558 


784CIP2_202 


7786 


201 


1987 


3773 


5559 


784CIP2_203 


7791 


202 


1988 


3774 


5560 


784CIP2_204 


7797 


203 


1989 


3775 


5561 


784CIP2_205


7806 


204 


1990 


3776 


5562 


784CIP2_206


7812 


205 


1991 


3777 


5563 


784CIP2_207


7812 


206 


1992 


3778 


5564 


784CIP2_208 


7818 


207 


1993 


3779 


5565 


784CIP2_209 


7822 


208 


1994 


3780 


5566 


784CIP2_210


7827 


209 


1995 


3781 


5567 


784CIP2_211


7830 


210 


1996


3782


5568


784CIP2_212 


7835 


211 


1997 


3783 


5569 


784CIP2_214 


7840 


212 


1998 


3784 


5570 


784CIP2_215 


7858 


213 


1999 


3785 


5571 


784CIP2_216 


7858 


214 


2000 


3786 


5572 


784CIP2_217 


7861 


215 


2001 


3787 


5573


784CIP2_218 


7866 


216 


2002 


3788 


5574 


784CIP2_219 


7866 


217 


2003 


3789 


5575 


784CIP2_220 


7896 


218 


2004 


3790 


5576 


784CIP2_221 


7898 


219 


2005 


3791 


5577 


784CIP2_222 


7900 


220 


2006 


3792 


5578 


784CIP2_223


7906 


221 


2007 


3793 


5579 


784CIP2_224


7908 


222 


2008 


3794 


5580 


784CIP2_225 


7909 


223 


2009 


3795 


5581 


784CIP2_226 


7917 


224 


2010 


3796 


5582 


784CIP2_227 


7932 


225 


2011 


3797 


5583 


784CIP2_228


7940 


226 


2012 


3798 


5584 


784CIP2_229


7940 


227 


2013 


3799 


5585 


784CIP2_230


7984 


228 


2014 


3800 


5586 


784CIP2_231 


7984 


229 


2015 


3801 


5587 


784CIP2_232 


8001 


230 


2016 


3802 


5588 


784CIP2_233 


8021 


231 


2017 


3803 


5589 


784CIP2_234 


8029


232 


2018 


3804 


5590 


784CIP2_235 


8033 


233 


2019 


3805 


5591 


784CIP2_236


8040 


234 


2020 


3806 


5592 


784CIP2_237


8052 


235 


2021 


3807 


5593 


784CIP2_238 


8096 


236 


2022 


3808 


5594 


784CIP2_239


8096 


237 


2023 


3809 


5595 


784CIP2_240


8113 


238 


2024 


3810 


5596


784CIP2_241


8126 


239 


2025 


3811 


5597 


784CIP2_242 


8132 


240 


2026 


3812 


5598 


784CIP2_243 


8137 


241 


2027 


3813 


5599 


784CIP2_244 


8137 


242 


2028 


3814 


5600 


784CIP2_245


8159 


243 


2029 


3815 


5601


784CIP2_246


8159 


244 


2030 


3816 


5602 


784CIP2_247


8161 


245 


2031 


3817 


5603 


784CIP2_248


8176


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725




246 


2032 


3818 


5604 


784CIP2_249 


8196


247 


2033 


3819 


5605 


784CIP2_250 


8200 


248 


2034 


3820 


5606 


784CIP2_251


8212 


249 


2035 


3821 


5607 


784CIP2_252 


8220 


250


2036 


3822 


5608 


784CIP2_253 


8238 


251 


2037 


3823 


5609 


784CIP2_254 


8254 


252


2038


3824


5610


784CIP2_255


8255 


253 


2039 


3825 


5611 


784CIP2_256 


8288 


254 


2040 


3826 


5612 


784CIP2_257


8296 


255 


2041 


3827 


5613 


784CIP2_258


8329 


256


2042


3828


5614


784CIP2_259


8362 


257 


2043


3829


5615


784CIP2_260


8429 


258 


2044 


3830 


5616 


784CIP2_261 


8436


259 


2045 


3831 


5617 


784CIP2_262 


8448 


260 


2046 


3832 


5618 


784CIP2_263


8472 


261


2047


3833


5619


784CIP2_264


8502 


262 


2048 


3834 


5620 


784CIP2_265 


8504 


263 


2049 


3835 


5621 


784CIP2_266 


8507 


264 


2050 


3836 


5622 


784CIP2_268 


8509 


265 


2051 


3837 


5623 


784CIP2_269


8515 


266 


2052 


3838 


5624 


784CIP2_270 


8519 


267 


2053 


3839 


5625 


784CIP2_271 


8530 


268 


2054 


3840


5626 


784CIP2_272 


8532 


269 


2055 


3841 


5627 


784CIP2_273


8532 


270 


2056 


3842 


5628 


784CIP2_274 


8539


271 


2057 


3843


5629 


784CIP2_275 


8541 


272 


2058 


3844


5630


784CIP2_276


8543


273 


2059 


3845 


5631 


784CIP2_277


8593


274 


2060 


3846 


5632 


784CIP2_278


8595 


275 


2061 


3847


5633


784CIP2_279


8615 


276 


2062


3848


5634


784CIP2_280


8620 


277


2063


3849


5635


784CIP2_281


8621 


278


2064


3850


5636


784CIP2_282


8623


279


2065


3851


5637


784CIP2_283


8625 


280


2066


3852


5638


784CIP2_284


8628 


281


2067


3853


5639


784CIP2_285


8628 


282


2068


3854


5640


784CIP2_286


8629 


283 


2069 


3855


5641


784CIP2_287


8630 


284 


2070 


3856 


5642 


784CIP2_288 


8631 


285


2071 


3857 


5643 


784CIP2_289 


8633 


286 


2072 


3858 


5644 


784CIP2_290


8634


287 


2073 


3859


5645 


784CIP2_291 


8635 


288 


2074 


3860 


5646 


784CIP2_292 


8636 


289 


2075 


3861 


5647 


784CIP2_293 


8659 


290 


2076 


3862 


5648 


784CIP2_294


8660 


291 


2077 


3863 


5649 


784CIP2_295 


8667


292 


2078 


3864


5650 


784CIP2_296 


8667 


293 


2079 


3865 


5651 


784CIP2_297


8685


294 


2080 


3866 


5652 


784CIP2_298


8805 


295 


2081


3867 


5653 


784CIP2 299 


8896 


296 


2082 


3868 


5654 


784CIP2_300 


8978 


297 


2083 


3869 


5655 


784CIP2_301


9046 


298 


2084 


3870 


5656 


784CIP2_302 


9048 


299 


2085 


3871 


5657 


784CIP2 303 


9116 


300 


2086 


3872 


5658 


784CIP2_304 


9195 


301 


2087 


3873 


5659 


784CIP2_305 


9201 


302 


2088 


3874 


5660 


784CIP2_306 


9307 


303 


2089 


3875 


5661 


784CIP2_307


9321 


304 


2090


3876 


5662 


784CIP2_308


9397 


305 


2091 


3877 


5663 


784CIP2_309 


9405 


306 


2092 


3878 


5664 


784CIP2_310


9406 


307 


2093 


3879 


5665


784CIP2_311 


9422 
SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725


308


2094


3880 


5666 


784CIP2_312


9494 


309 


2095 


3881 


5667 


784CIP2_313 


9512 


310 


2096 


3882 


5668 


784CIP2 314 


9632 


311 


2097 


3883 


5669 


784CIP2_315 


9661 


312 


2098 


3884 


5670


784CIP2_316


9664 


313 


2099 


3885 


5671 


784CIP2 317 


9691 


314 


2100 


3886 


5672


784CIP2_318


9700 


315 


2101 


3887 


5673 


784CIP2 319 


9716 


316


2102 


3888 


5674 


784CIP2_320 


9721 


317 


2103 


3889 


5675 


784CIP2_321 


9870 


318 


2104


3890 


5676 


784CIP2 322 


9887 


319 


2105


3891 


5677 


784CIP2 323 


9923 


320 


2106


3892


5678


784CIP2 324 


9938 


321 


2107 


3893 


5679 


784CIP2 325 


9964 


322 


2108 


3894


5680 


784CIP2 326 


10007 


323 


2109 


3895 


5681


784CIP2 327 


10009 


324 


2110 


3896 


5682 


784CIP2_328


10046 


325 


2111 


3897 


5683 




10156 


326 


2112 


3898


5684

TOAPTDO i "in 


10276 


327 


2113 


3899 


5685 


OQ/l.r'TDO 1*11 
/ O ft*— J. Jfj£ OJl 


10283 


328 


2114 


3900 


5686 


Tn^r'TTJiOH T 


152 


329 


2115 


3901 


5687 


TQAPTMR O 


167 


330 


2116 


3902 


5688 


to** V & n -J 


205 


331 


2117 


3903 


5689 




210 


332 


2118 


3904 


5690




225 


333 


2119 


3905 


5691 




226 


334 


2120 


3906 


5692 


784CIP2B_7


264 


335 


2121 


3907


5693


784CIP2B_8


268 


336 


2122 


3908


5694


784CIP2B_9


293 


337 


2123 


3909 


5695 


784CIP2B_10


293 


338 


2124 


3910 


5696 


784CIP2B 11 


293 


339 


2125 


3911


5697 


784CIP2B 12 


302 


340 


2126 


3912


5698 


784CIP2B 13 


311 


341 


2127 


3913


5699 


784CIP2B 14 


352 


342 


2128 


3914 


5700 


784CIP2B_15 


358 


343 


2129 


3915


5701 


784CIP2B_16 


368 


344 


2130


3916 


5702 


784CIP2B 17 


393


345


2131


3917 


5703 


784CIP2B 18 


477 


346 


2132


3918 


5704 


784CIP2B_19


508 


347


2133 


3919 


5705 


784CIP2B_20


508 


348 


2134


3920 


5706 


784CIP2B_21 


515 


349


2135


3921 


5707


784CIP2B_22 


578


350


2136


3922 


5708 


784CIP2B_23 


588 


351


2137


3923 


5709 


784CIP2B 24 


591 


352


2138 


3924 


5710 


784CIP2B_25 


593


353


2139


3925 


5711 


784CIP2B_26 


594 


354 


2140


3926 


5712 


784CIP2B_27 


619


355 


2141


3927 


5713 


784CIP2B_28 


620 


356 


2142


3928


5714


784CIP2B 29 


f&4 


357 


2143 


3929


5715 


784CIP2B 30 


692 


358 


2144 


3930


5716 


784CIP2B 31 


753 


359 


2145 


3931 


5717 


784CIP2B_32 


758 


360 


2146 


3932 


5718 


784CIP2B_33 


787 


361


2147 


3933 


5719


784CIP2B 34 


833 


362 


2148 


3934 


5720 


784CIP2B_35


838 


363 


2149 


3935 


5721 


784CIP2B_36


870 


364 


2150 


3936 


5722 


784CIP2B 37 


891 


365 


2151 


3937 


5723 


784CIP2B 38 


891 


366


2152 


3938 


5724 


784CIP2B_39 


921 


367 


2153 


3939 


5725 


784CIP2B 40 


924 


368 


2154 


3940 


5726 


784CIP2B 41 


932 


369 


2155 


3941 


5727 


784CIP2B 42 


942 


370 


2156 


3942 


5728 


784CIP2B 43 


958 


371 


2157 


3943 


5729 


784CIP2B_44


968 


372 


2158 


3944 


5730 


784CIP2B_45 


992 


373 


2159 


3945 


5731 


784CIP2B 46 


1025 


374 


2160 


3946 


5732 


784CIP2B_47 


1074 


375 


2161 


3947 


5733 


784CIP2B_48 


1104 


376 


2162 


3948 


5734 


784CIP2B_49


1114 


377 


2163 


3949 


5735 


784CIP2B 50 


1144 


378 


2164 


3950 


5736 


784CIP2B_51 


1262 


379 


2165 


3951 


5737 


784CIP2B_52


1318 


380 


2166 


3952 


5738 


784CIP2B_53


1319 


381 


2167 


3953 


5739 


784CIP2B_54 


1328 


382 


2168 


3954 


5740


784CIP2B_55 


1436 


383 


2169 


3955 


5741 


784CIP2B_56 


1464 


384 


2170 


3956 


5742 


784CIP2B_57 


1584 


385 


2171 


3957 


5743


784CIP2B_58 


1617 


386 


2172 


3958 


5744 


784CIP2B_59


1724 


387 


2173 


3959 


5745 


784CIP2B_60 


1728 


388 


2174 


3960 


5746


784CIP2B_61 


1772 


389 


2175 


3961 


5747 


784CIP2B 62 


1809 


390 


2176


3962 


5748 


784CIP2B_63 


1868 


391 


2177 


3963 


5749 


784CIP2B_64 


1898 


392 


2178 


3964 


5750 


784CIP2B_65 


1926 


393 


2179 


3965 


5751 


784CIP2B_66


1965 


394 


2180 


3966 


5752 


784CIP2B_67


1967 


395


2181 


3967 


5753 


784CIP2B_68 


1995 


396 


2182 


3968 


5754 


784CIP2B_69 


2005


397 


2183 


3969 


5755 


784CIP2B 70 


2027 


398 


2184 


3970 


5756 


784CIP2B_71 


2055 


399 


2185 


3971 


5757


784CIP2B 72 


2103 


400 


2186 


3972 


5758 


784CIP2B_73 


2106 


401 


2187 


3973 


5759 


784CIP2B 74 


2166 


402 


2188 


3974 


5760 


784CIP2B_75


2175 


403 


2189 


3975 


5761 


784CIP2B_76 


2176 


404 


2190 


3976 


5762 


784CIP2B 78 


2236 


405 


2191 


3977 


5763 


784CIP2B_79 


2250 


406 


2192 


3978 


5764


784CIP2B 80 


2300


407 


2193 


3979 


5765


784CIP2B_81 


2323 


408


2194 


3980 


5766


784CIP2B_82 


2340 


409 


2195 


3981 


5767 


784CIP2B_83 


2371 


410 


2196 


3982 


5768 


784CIP2B_84 


2399 


411 


2197 


3983 


5769


784CIP2B_85 


2411 


412 


2198 


3984 


5770 


784CIP2B_86 


2428 


413 


2199 


3985 


5771 


784CIP2B_87


2430 


414 


2200 


3986 


5772 


784CIP2B 88 


2439 


415 


2201 


3987 


5773 


784CIP2B_89 


2447 


416 


2202 


3988 


5774 


784CIP2B 90 


2461 


417 


2203 


3989


5775 


784CIP2B_91 


2487 


418 


2204 


3990 


5776 


784CIP2B 92 


2492 


419 


2205 


3991 


5777 


784CIP2B_93


2512


420


2206 


3992 


5778 


784CIP2B_94 


2564 


421 


2207 


3993 


5779 


784CIP2B_95 


2678 


422 


2208 


3994 


5780 


784CIP2B_96 


2816 


423 


2209 


3995 


5781 


784CIP2B 97 


2818 


424 


2210 


3996 


5782 


784CIP2B_98 


2819 


425 


2211 


3997 


5783 


784CIP2B_99


2943 


426 


2212 


3998 


5784 


784CIP2B 100 


3137 


427 


2213 


3999 


5785 


784CIP2B_101


3137


428 


2214 


4000 


5786 


784CIP2B 102 


3160 


429 


2215 


4001 


5787 


784CIP2B 103 


3323 


430 


2216 


4002 


5788 


784CIP2B_104


3360 


431 


2217 


4003 


5789 


784CIP2B_105


3362 


432 


2218


4004 


5790 


784CIP2B_106 


3417 


433 


2219 


4005 


5791 


784CIP2B_107


3418 


434 


2220 


4006 


5792 


784CIP2B_108


3442 


435 


2221 


4007 


5793 


784CIP2B_109 


3442 


436 


2222 


4008 


5794 


784CIP2B 110 


3444 


437 


2223 


4009 


5795 


784CIP2B_111 


3855 


438 


2224 


4010 


5796 


784CIP2B_112 


3863 


439 


2225 


4011 


5797 


784CIP2B_113 


4090 


440 


2226 


4012 


5798 


784CIP2B_114 


4105 


441 


2227 


4013 


5799 


784CIP2B_115 


4142 


442 


2228 


4014 


5800 


784CIP2B_116 


4142 


443 


2229 


4015 


5801 


784CIP2B_117 


4149 


444 


2230 


4016 


5802 


784CIP2B 118 


4196 


445 


2231 


4017 


5803


784CIP2B_119 


4202 


446 


2232 


4018 


5804 


784CIP2B_120 


4274 


447 


2233 


4019 


5805 


784CIP2B_121 


4304 


448 


2234 


4020 


5806 


784CIP2B_122 


4306 


449


2235


4021 


5807


784CIP2B 123 


4311 


450 


2236 


4022 


5808 


784CIP2B_124 


4321 


451 


2237 


4023 


5809 


784CIP2B_125


4323 


452 


2238 


4024 


5810 


784CIP2B_126 


4332 


453 


2239 


4025


5811 


784CIP2B_127 


4488 


454 


2240 


4026


5812 


784CIP2B_128 


4588 


455 


2241 


4027 


5813 


784CIP2B_129 


5569 


456 


2242 


4028


5814 


784CIP2B_130 


5573


457 


2243 


4029 


5815 


784CIP2B_131


5577 


458 


2244 


4030 


5816 


784CIP2B_132


5579 


459 


2245 


4031 


5817 


784CIP2B_133 


5582 


460 


2246 


4032 


5818 


784CIP2B_134 


5583 


461 


2247 


4033 


5819 


784CIP2B_135 


5584 


462 


2248 


4034 


5820 


784CIP2B_136


5585


463 


2249 


4035 


5821 


784CIP2B_137 


5591 


464 


2250 


4036 


5822 


784CIP2B_138 


5593 


465 


2251 


4037 


5823


784CIP2B 139 


5594 


466 


2252 


4038 


5824 


784CIP2B_140 


5594 


467 


2253 


4039 


5825 


784CIP2B_141 


5598 


468 


2254 


4040 


5826 


784CIP2B_142 


5602 


469 


2255 


4041


5827 


784CIP2B_143


5605 


470 


2256


4042 


5828 


784CIP2B_144 


5608 


471 


2257 


4043 


5829 


784CIP2B_145 


5617


472 


2258 


4044


5830 


784CIP2B_146 


5620 


473 


2259 


4045 


5831 


784CIP2B_147


5622 


474 


2260 


4046 


5832 


784CIP2B_148


5623 


475 


2261 


4047 


5833 


784CIP2B_149 


5624 


476 


2262 


4048 


5834 


784CIP2B_150 


5625 


477 


2263 


4049 


5835 


784CIP2B_151 


5627 


478 


2264 


4050 


5836 


784CIP2B_152


5628 


479 


2265


4051 


5837 


784CIP2B_153 


5630 


480 


2266 


4052 


5838 


784CIP2B_154 


5632 


481


2267 


4053 


5839 


784CIP2B_155


5640 


482 


2268 


4054 


5840 


784CIP2B_156


5641


483 


2269 


4055


5841 


784CIP2B_157


5643 


484 


2270 


4056 


5842 


784CIP2B_158 


5647 


485 


2271 


4057 


5843 


784CIP2B_159 


5649 


486 


2272 


4058 


5844 


784CIP2B_160 


5658


487 


2273 


4059 


5845 


784CIP2B 161 


5659 


488 


2274 


4060 


5846 


784CIP2B 162 


5667 


489 


2275 


4061 


5847 


784CIP2B_163


5672 


490 


2276 


4062 


5848 


784CIP2B_164 


5674


491


2277


4063 


5849 


784CIP2B 165 


5678 


492 


2278 


4064 


5850


784CIP2B_166 


5680 


493 


2279 


4065 


5851 


784CIP2B 167 


5684 


494 


2280


4066 


5852 


784CIP2B_168


5686 


495 


2281


4067 


5853 


784CIP2B_169 


5694 


496 


2282 


4068 


5854 


784CIP2B 170 


5698 


497 


2283 


4069 


5855 


784CIP2B_171


5699 


498


2284 


4070 


5856 


784CIP2B_172 


5712 


499 


2285 


4071 


5857 


784CIP2B_173 


5719 


500 


2286 


4072 


5858 


784CIP2B 174 


5720 


501 


2287 


4073 


5859 


784CIP2B_175


5727 


502 


2288


4074 


5860 


784CIP2B 176 


5730 


503 


2289 


4075 


5861 


784CIP2B_177 


5734 


504 


2290 


4076 


5862 


784CIP2B 178 


5738 


505 


2291 


4077 


5863 


784CIP2B 179 


5739 


506 


2292 


4078 


5864


784CIP2B 180 


5740 


507 


2293 


4079 


5865 


784CIP2B_181


5744 


508 


2294 


4080 


5866 


784CIP2B_182 


5748 


509 


2295 


4081 


5867 


784CIP2B 183 


5749 


510 


2296 


4082 


5868 


784CIP2B 184 


5750 


511 


2297 


4083 


5869 


784CIP2B_185 


5750 


512 


2298 


4084 


5870 


784CIP2B 186 


5750 


513 


2299 


4085 


5871


784CIP2B_187


5761 


514 


2300 


4086 


5872 


784CIP2B 188 


5762 


515 


2301 


4087 


5873 


784CIP2B 189 


5767 


516 


2302 


4088 


5874 


784CIP2B_190


5773 


517 


2303 


4089 


5875 


784CIP2B 191 


5783 


518 


2304 


4090 


5876 


784CIP2B 192 


5784 


519 


2305 


4091 


5877 


784CIP2B 193 


5788 


520 


2306 


4092 


5878 


784CIP2B 194 


5798 


521 


2307 


4093 


5879 


784CIP2B 196 


5807 


522 


2308 


4094 


5880 


784CIP2B_197 


5818 


523 


2309 


4095 


5881 


784CIP2B 198 


5819 


524 


2310 


4096 


5882


784CIP2B_199 


5827 


525 


2311 


4097 


5883 


784CIP2B_200 


5828 


526 


2312 


4098


5884 


784CIP2B_201 


5842 


527 


2313 


4099 


5885 


784CIP2B_202 


5853 


528 


2314 


4100 


5886 


784CIP2B_203 


5861 


529 


2315 


4101 


5887 


784CIP2B_204 


5864 


530 


2316 


4102 


5888 


784CIP2B_205


5865 


531 


2317 


4103 


5889 


784CIP2B_206 


5871 


532 


2318 


4104 


5890 


784CIP2B 207 


5873 


533 


2319 


4105 


5891 


784CIP2B_208 


5873 


534 


2320 


4106 


5892 


784CIP2B_209


5875 


535 


2321 


4107 


5893 


784CIP2B_210


5878 


536 


2322 


4108 


5894 


784CIP2B 211 


5879 


537


2323 


4109 


5895 


784CIP2B_212


5880 


538


2324 


4110 


5896 


784CIP2B_213 


5880 


539


2325 


4111 


5897 


784CIP2B_214


5880


540


2326


4112 


5898 


784CIP2B 215 


5880 


541 


2327 


4113 


5899 


784CIP2B_216 


5885 


542 


2328


4114 


5900 


784CIP2B 217 


5895


543


2329


4115 


5901 


784CIP2B_218


5898 


544 


2330 


4116 


5902 


784CIP2B_219 


5902 


545


2331


4117 


5903 


784CIP2B_220 


5904 


546 


2332 


4118


5904


784CIP2B_221




547 


2333 


4119 


5905 


784CIP2B_222


5921 


548 


2334 


4120 


5906 


784CIP2B 223 


5927 


549 


2335 


4121 


5907 


784CIP2B_224 


5932 


550 


2336 


4122 


5908 


784CIP2B_225 


5939 


551 


2337 


4123 


5909 


784CIP2B 226 


5945 


552 


2338 


4124 


5910 


784CIP2B 227 


5946 


553 


2339 


4125 


5911 


784CIP2B 228 


5947 


554 


2340 


4126 


5912 


784CIP2B_229 


5956 


555 


2341 


4127 


5913 


784CIP2B 230 


5967 


556 


2342 


4128 


5914 


784CIP2B_232 


5975 


557 


2343 


4129 


5915 


784CIP2B_233 


5977 


558


2344 


4130 


5916 


784CIP2B_234


5978 


559 


2345 


4131 


5917 


784CIP2B_235


5979 


560 


2346 


4132 


5918 


784CIP2B 236 


5980 


561 


2347 


4133 


5919 


784CIP2B_237 


5988 


562 


2348


4134 


5920


784CIP2B_238


5989 


563 


2349 


4135 


5921 


784CIP2B_239 


5991 


564 


2350 


4136 


5922 


784CIP2B 240 


5997 


565 


2351 


4137 


5923 


784CIP2B_241 


5998 


566 


2352 


4138 


5924 


784CIP2B 242 


6003 


567 


2353 


4139 


5925 


784CIP2B 243 


6004 


568 


2354 


4140 


5926 


784CIP2B_244


6013 


569 


2355 


4141 


5927 


784CIP2B 245 


6028 


570 


2356 


4142 


5928 


784CIP2B 246 


6028 


571 


2357 


4143 


5929 


784CIP2B 247 


6029 


572 


2358 


4144 


5930 


784CIP2B_248 


6031 


573 


2359 


4145 


5931 


784CIP2B_249


6031 


574 


2360 


4146 


5932 


784CIP2B 250 


6032 


575


2361 


4147 


5933 


784CIP2B_251 


6037 


576 


2362 


4148 


5934 


784CIP2B 252 


6037 


577 


2363 


4149 


5935


784CIP2B_253


6043 


578 


2364 


4150 


5936 


784CIP2B 254 


6044 


579


2365 


4151 


5937 


784CIP2B_255


6046 


580 


2366 


4152 


5938 


784CIP2B_256


6048 


581 


2367 


4153 


5939 


784CIP2B_257 


6049 


582 


2368 


4154 


5940 


784CIP2B 258 


6051 


583 


2369


4155 


5941 


784CIP2B_259


6053 


584 


2370 


4156 


5942 


784CIP2B 260 


6060 


585 


2371 


4157 


5943 


784CIP2B 261 


6063 


586 


2372 


4158 


5944 


784CIP2B_262 


6066 


587 


2373 


4159 


5945 


784CIP2B 263 


6067 


588


2374 


4160 


5946 


784CIP2B_264 


6068 


589 


2375 


4161 


5947 


784CIP2B 265 


6073 


590 


2376 


4162 


5948 


784CIP2B_266


6076 


591


2377 


4163 


5949 


784CIP2B 267 


6076 


592 


2378 


4164 


5950 


784CIP2B 268 


6077 


593 


2379 


4165 


5951 


784CIP2B_269 


6079 


594 


2380 


4166 


5952 


784CIP2B_270 


6082 


595 


2381 


4167 


5953 


784CIP2B_272


6088 


596 


2382 


4168 


5954


784CIP2B_273 


6091 


597 


2383 


4169 


5955 


784CIP2B_274 


6094 


598 


2384 


4170 


5956 


784CIP2B_275 


6101 


599 


2385 


4171 


5957 


784CIP2B 276 


6103 


600 


2386 


4172 


5958 


784CIP2B 277 


6104 


601 


2387 


4173 


5959 


784CIP2B_278 


6108 


602 


2388 


4174 


5960 


784CIP2B_279 


6112


603 


2389 


4175 


5961 


784CIP2B_280 


6121 


604 


2390 


4176 


5962 


784CIP2B 281 


6125 


605 


2391 


4177 


5963 


784CIP2B_282 


6126 


606 


2392 


4178 


5964


784CIP2B_283 


6128 


607 


2393 


4179 


5965


784CIP2B_284


6129 


608


2394 


4180 


5966 


784CIP2B 285 


6133 


609 


2395 


4181 


5967


784CIP2B_286 


6133 


610 


2396 


4182


5968 


784CIP2B_287 


6135 


611 


2397 


4183 


5969 


784CIP2B 288 


6139 


612 


2398 


4184 


5970 


784CIP2B 289 


6141 


613 


2399 


4185 


5971 


784CIP2B 290 


6145 


614 


2400 


4186 


5972 


784CIP2B 291 


6146 


615 


2401 


4187 


5973 


784CIP2B_292 


6148 


616 


2402 


4188 


5974 


784CIP2B 293 


6149 


617 


2403


4189 


5975 


784CIP2B 294 


6149 
WO 01/53312



PCT/US00/34263





618 


2404 


4190


5976 


784CIP2B_295 


6153 


619


2405 


4191 


5977 


784CIP2B_296 


6159 


620


2406 


4192 


5978 


784CIP2B_297 


6164 


621 


2407 


4193 


5979 


784CIP2B 298 


6167 


622 


2408 


4194 


5980 


784CIP2B_299 


6172 


623 


2409 


4195 


5981 


784CIP2B 300 


6173 


624 


2410 


4196 


5982 


784CIP2B_301 


6190 


625 


2411 


4197 


5983 


784CIP2B 302 


6194 


626 


2412 


4198 


5984 


784CIP2B_303 


6196 


627 


2413 


4199 


5985 


784CIP2B 304 


6197 


628 


2414 


4200 


5986


784CIP2B 305 


6198 


629 


2415 


4201 


5987 


784CIP2B_306 


6198 


630 


2416 


4202 


5988 


784CIP2B_308 


6214 


631 


2417 


4203 


5989 


784CIP2B 309 


6215 


632 


2418 


4204 


5990 


784CIP2B_310 


6219 


633 


2419 


4205 


5991 


784CIP2B 311 


6226 


634 


2420 


4206 


5992 


784CIP2B_312


6229 


635 


2421 


4207 


5993 


784CIP2B 313 


6234


636 


2422 


4208 


5994 


784CIP2B_314 


6237 


637 


2423 


4209 


5995 


784CIP2B 315 


6238 


638 


2424 


4210 


5996 


784CIP2B_316 


6239 


639 


2425 


4211 


5997 


784CIP2B_317 


6239 


640 


2426 


4212 


5998 


784CIP2B_318 


6239 


641 


2427 


4213 


5999 


784CIP2B_319 


6240 


642 


2428 


4214 


6000 


784CIP2B 320 


6244 


643 


2429 


4215 


6001 


784CIP2B_321 


6245 


644 


2430 


4216 


6002 


784CIP2B_322 


6250 


645 


2431 


4217 


6003 


784CIP2B 323 


6252 


646 


2432 


4218 


6004 


784CIP2B 324 


6252 


647 


2433 


4219 


6005 


784CIP2B_325 


6256 


648 


2434 


4220 


6006 


784CIP2B 326 


6260 


649 


2435 


4221 


6007 


784CIP2B_327 


6261 


650 


2436 


4222 


6008 


784CIP2B_328


6264 


651 


2437


4223 


6009 


784CIP2B 329 


6265 


652 


2438 


4224 


6010 


784CIP2B_330 


6266 


653 


2439 


4225 


6011 


784CIP2B_331


6270 


654 


2440


4226 


6012 


784CIP2B 332 


6271 


655 


2441 


4227 


6013 


784CIP2B_334


6274 


656 


2442 


4228 


6014 


784CIP2B_335 


6276 


657 


2443 


4229 


6015 


784CIP2B_336


6281 


658 


2444 


4230 


6016 


784CIP2B 337 


6281 


659 


2445 


4231 


6017 


784CIP2B 338 


6288 


660 


2446 


4232 


6018


784CIP2B 339 


6292 


661 


2447 


4233 


6019 


784CIP2B 340 


6294 


662 


2448 


4234 


6020 


784CIP2B_343 


6312 


663 


2449 


4235 


6021 


784CIP2B 344 


6312 


664


2450 


4236 


6022 


784CIP2B 345 


6312 


665 


2451 


4237


6023 


784CIP2B_346 


6322 


666


2452 


4238 


6024 


784CIP2B 347 


6324 


667


2453 


4239 


6025 


784CIP2B 349 


6329 


668


2454 


4240


6026 


784CIP2B 350 


6331 


669


2455 


4241 


6027 


784CIP2B 351 


6333 


670 


2456


4242


6028


784CIP2B 352 


6334 


671 


2457 


4243 


6029 


784CIP2B 353 


6337 


672 


2458 


4244 


6030 


784CIP2B 354 


6339 


673 


2459 


4245 


6031 


784CIP2B_355


6346 


674 


2460 


4246 


6032 


784CIP2B_356


6348 


675 


2461 


4247 


6033 


784CIP2B_357


6348


676 


2462 


4248 


6034 


784CIP2B 358 


6350 


677 


2463 


4249 


6035 


784CIP2B 359 


6351 


678 


2464 


4250 


6036 


784CIP2B_360 


6355 


679 


2465 


4251 


6037 


784CIP2B_361


6362


680 


2466 


4252


6038 


784CIP2B 362 


6368 


681


2467 


4253 


6039 


784CIP2B 363 


6369 


682 


2468 


4254 


6040 


784CIP2B 364 


6371 


683 


2469 


4255 


6041 


784CIP2B 365 


6376 


684


2470 


4256 


6042 


784CIP2B 366 


6379 


685


2471 


4257 


6043 


784CIP2B_367 


6380 


686 


2472 


4258 


6044 


784CIP2B 368 


6381 


687


2473 


4259


6045 


784CIP2B 369 


6392 


688


2474 


4260 


6046 


784CIP2B 370 


6395


689


2475


4261 


6047 


784CIP2B_371


6397 


690 


2476 


4262 


6048 


784CIP2B 372 


6400 


691 


2477 


4263 


6049 


784CIP2B 373 


6401 


692 


2478 


4264 


6050 


784CIP2B_374


6411 


693 


2479 


4265 


6051 


784CIP2B 375 


6411 


694 


2480 


4266 


6052 


784CIP2B_376 


6411 


695 


2481 


4267 


6053 


784CIP2B_377


6416 


696 


2482 


4268 


6054 


784CIP2B 378 


6418 


697 


2483 


4269 


6055 


784CIP2B_379 


6422 


698


2484 


4270 


6056


784CIP2B 380 


6423 


699 


2485 


4271 


6057 


784CIP2B_381 


6426 


700 


2486


4272 


6058 


784CIP2B 382 


6427


701 


2487 


4273 


6059 


784CIP2B_383 


6428 


702 


2488


4274


6060 


784CIP2B_384 


6429 


703 


2489 


4275 


6061 


784CIP2B 385 


6430 


704 


2490 


4276 


6062 


784CIP2B_386


6432 


705 


2491 


4277 


6063 


784CIP2B 387 


6432 


706 


2492 


4278 


6064 


784CIP2B 388 


6438 


707 


2493 


4279 


6065 


784CIP2B 389 


6441 


708 


2494 


4280 


6066 


784CIP2B_390 


6446 


709 


2495 


4281 


6067 


784CIP2B 391 


6454


710 


2496 


4282


6068


784CIP2B 392 


6459 


711 


2497 


4283 


6069 


784CIP2B_394


6461


712 


2498 


4284 


6070 


784CIP2B_395 


6467 


713 


2499 


4285 


6071 


784CIP2B 396 


6468 


714 


2500 


4286 


6072 


784CIP2B_397 


6487 


715 


2501 


4287 


6073 


784CIP2B 398 


6491 


716 


2502 


4288 


6074 


784CIP2B_399 


6506 


717 


2503 


4289 


6075 


784CIP2B 401 


6514 


718 


2504 


4290 


6076 


784CIP2B_402 


6519 


719 


2505 


4291 


6077 


784CIP2B_403 


6521 


720 


2506 


4292 


6078 


784CIP2B 404 


6532 


721 


2507 


4293 


6079 


784CIP2B_405 


6536 


722 


2508 


4294 


6080 


784CIP2B_406


6543 


723 


2509 


4295 


6081 


784CIP2B_407 


6544 


724 


2510 


4296 


6082 


784CIP2B_408


654 3 


725 


2511 


4297 


6083 


784CIP2B_409 


6551 


726 


2512 


4298 


6084 


784CIP2B 410 


6551 


727 


2513 


4299 


6085 


784CIP2B_411


6552 


728 


2514 


4300 


6086 


784CIP2B_412 


6554 


729 


2515 


4301 


6087 


784CIP2B_413 


6556 


730 


2516 


4302 


6088 


784CIP2B_414


6560 


731 


2517 


4303 


6089 


784CIP2B_415 


6563


732


2518


4304 


6090 


784CIP2B_416 


6564


733 


2519 


4305 


6091 


784CIP2B_417 


6567 


734 


2520 


4306 


6092 


784CIP2B_418 


6573 


735 


2521 


4307 


6093 


784CIP2B_419 


6575 


736 


2522 


4308 


6094 


784CIP2B_420 


6577 


737 


2523


4309 


6095 


784CIP2B_421 


6593 


738 


2524 


4310 


6096 


784CIP2B 422 


6595 


739 


2525 


4311 


6097 


784CIP2B 423 


6599 


740 


2526 


4312 


6098 


784CIP2B_424


6625 


741 


2527 


4313 


6099


784CIP2B 425 


6625 


742 


2528 


4314 


6100 


784CIP2B 426 


6626 


743 


2529 


4315 


6101 


784CIP2B_427 


6630 


744 


2530 


4316 


6102 


784CIP2B_428 


6631 


745 


2531 


4317 


6103 


784CIP2B_429 


6632 


746


2532 


4318 


6104 


784CIP2B 430 


6633 


747 


2533 


4319 


6105 


784CIP2B_431 


6634 


748 


2534 


4320 


6106 


784CIP2B_432 


6638 


749 


2535 


4321 


6107 


784CIP2B_433 


6641 


750 


2536 


4322 


6108 


784CIP2B_434 


6644 


751 


2537 


4323 


6109 


784CIP2B 435 


6646 


752 


2538 


4324 


6110 


784CIP2B 436 


6648 


753 


2539 


4325 


6111 


784CIP2B_437 


6652


754 


2540 


4326 


6112 


784CIP2B 438 


6654 


755 


2541 


4327 


6113 


784CIP2B 439 


6657 


756 


2542 


4328 


6114 


784CIP2B 440 


6658 


757 


2543 


4329 


6115 


784CIP2B_441 


6663 


758 


2544 


4330 


6116 


784CIP2B_442 


6664 


759


2545 


4331 


6117 


784CIP2B_443


6668


760 


2546 


4332 


6118 


784CIP2B_444


6669 


761 


2547 


4333 


6119 


784CIP2B 445 


6673 


762 


2548 


4334 


6120 


784CIP2B_446 


6685 


763 


2549 


4335


6121 


784CIP2B 447 


6687 


764 


2550 


4336 


6122 


784CIP2B_448 


6689 


765 


2551 


4337 


6123 


784CIP2B_449 


6693 


766 


2552


4338 


6124 


784CIP2B_450 


6698 


767 


2553 


4339 


6125 


784CIP2B 451 


6699 


768 


2554 


4340 


6126 


784CIP2B 452 


6705 


769 


2555 


4341 


6127 


784CIP2B_453 


6711 


770 


2556 


4342 


6128 


784CIP2B 454 


6713 


771 


2557 


4343 


6129 


784CIP2B_455


6716 


772 


2558


4344 


6130 


784CIP2B_456 


6725 


773 


2559 


4345 


6131 


784CIP2B_457 


6726 


774 


2560 


4346 


6132 


784CIP2B_458


6727 


775 


2561 


4347 


6133 


784CIP2B_459 


6730 


776 


2562 


4348 


6134


784CIP2B_460 


6730 


777 


2563 


4349 


6135 


784CIP2B_461 


6730 


778 


2564 


4350 


6136


784CIP2B_462


6732 


779 


2565 


4351 


6137 


784CIP2B_463


6733 


780 


2566 


4352 


6138 


784CIP2B 464 


6737 


781 


2567 


4353 


6139 


784CIP2B_465


6745 


782 


2568 


4354 


6140 


784CIP2B_466 


6751 


783 


2569 


4355 


6141 


784CIP2B_467


6754 


784 


2570 


4356 


6142 


784CIP2B 468 


6758 


785 


2571 


4357 


6143


784CIP2B 469 


6761 


786 


2572 


4358 


6144 


784CIP2B_470 


6765


787 


2573 


4359 


6145


784CIP2B_471 


6768 


788


2574 


4360 


6146


784CIP2B 472 


6773 


789


2575


4361


6147 


784CIP2B 473 


6776 


790


2576 


4362 


6148 


784CIP2B__474 


6796 


791


2577 


4363 


6149 


784CIP2B 475 


6798 


792 


2578


4364 


6150 


784CIP2B_476 


6823


793


2579


4365 


6151 


784CIP2B 477 


6825


794 


2580 


4366


6152


784CIP2B_478

D OZO 


795 


2581 


4367 


6153 


784CIP2B_479 


6839 


796 


2582


4368


6154 


784CIP2B_480


6844 


797 


2583 


4369 


6155 


784CIP2B_482


6849 


798 


2584 


4370 


6156 


784CIP2B_483


6854


799 


2585 


4371 


6157 


784CIP2B_484


6857 


800 


2586 


4372 


6158


784CIP2B_485 


6861 


801 


2587 


4373 


6159 


784CIP2B 486 


6873 


802 


2588 


4374 


6160 


784CIP2B_487


6875 


803 


2589 


4375


6161 


784CIP2B 488 


6877 


804 


2590 


4376 


6162 


784CIP2B 489 


6880 


805 


2591 


4377


6163 


784CIP2B 490 


6885 


806 


2592 


4378 


6164 


784CIP2B 491 


6890 


807 


2593 


4379


6165


784CIP2B_492


6890 


808 


2594 


4380 


6166


784CIP2B_493


6894 


809 


2595


4381


6167


784CIP2B 494 


6901 


810 


2596


4382 


6168 


784CIP2B 495 


6904 


811 


2597


4383


6169


784CIP2B 496 


6907 


812 


2598


4384


6170


784CIP2B 497 


6914 


813 


2599


4385 


6171 


784CIP2B 498 


6917 


814


2600


4386 


6172 


784CIP2B 499 


6923 


815 


2601


4387 


6173 


784CIP2B_500


6929 


816 


2602 


4388 


6174 


784CIP2B 501 


6931 


817


2603 


4389 


6175


784CIP2B 502 


6935


818


2604


4390 


6176 


784CIP2B 503 


6940 


819


2605 


4391 


6177


784CIP2B 504 


6945 


820 


2606


4392 


6178 


784CIP2B_505


6946 


821 


2607 


4393 


6179 


784CIP2B 506 


6947 


822 


2608 


4394 


6180 


784CIP2B 507 


6949


823 


2609 


4395 


6181 


784CIP2B 508 


6959 


824 


2610 


4396


6182 


784CIP2B 509 


6960 


825 


2611 


4397 


6183 


784CIP2B_510


6962 


826 


2612 


4398 


6184 


784CIP2B 511 


6963 


827 


2613 


4399 


6185 


784CIP2B_512 


6967 


828 


2614 


4400 


6186 


784CIP2B 513 


6983 


829


2615 


4401 


6187


784CIP2B 514 


6988 


830 


2616 


4402 


6188


784CIP2B 515 


6996 


831 


2617 


4403 


6189 


784CIP2B 516 


7003


832 


2618 


4404 


6190 


784CIP2B 517 


7016 


833 


2619 


4405 


6191 


784CIP2B 518 


7017 


834 


2620 


4406 


6192 


784CIP2B_519


7025 


835


2621 


4407 


6193 


784CIP2B_520 


7025 


836


2622 


4408 


6194 


784CIP2B_521 


7025 


837 


2623 


4409 


6195 


784CIP2B_522 


7050 


838 


2624 


4410 


6196 


784CIP2B 523 


7051


839


2625


4411


6197 


784CIP2B 524 


7055 


840


2626


4412


6198 


784CIP2B 525 


7060 


841 


2627


4413 


6199 


784CIP2B 526 


7064 


842 


2628


4414 


6200


784CIP2B 527 


7067 


843


2629


4415


6201 


784CIP2B 528 


7071 


844 


2630 


4416 


6202 


784CIP2B 529 


7072 


845


2631


4417


6203 


784CIP2B 530 


7073 


846


2632


4418


6204 


784CIP2B 531 


7076 


847 


2633


4419 


6205 


784CIP2B 532 


7088 


848 


2634 


4420


6206 


784CIP2B 533 


7089 


849 


2635


4421


6207


784CIP2B 534 


7091 


850


2636


4422


6208 


784CIP2B 535 


7091 


851 


2637 


4423 


6209 


784CIP2B 536 


7104 


852


2638


4424


6210 


784CIP2B 537 


7105 


853 


2639 


4425 


6211 


784CIP2B 538 


7105 


854


2640


4426


6212 


784CIP2B 539 


7109 


855 


2641 


4427 


6213 


784CIP2B_540


7109 


856 


2642 


4428


6214


784CIP2B_541


7119 


857 


2643 


4429 


6215 


784CIP2B 542 


7120 


858 


2644 


4430 


6216 


784CIP2B 543 


7121 


859 


2645 


4431 


6217 


784CIP2B 544 


7126 


860 


2646 


4432 


6218 


784CIP2B 545 


7127 


861 


2647 


4433 


6219 


784CIP2B 546 


7130


862 


2648 


4434 


6220 


784CIP2B 547 


7131 


863 


2649 


4435 


6221 


784CIP2B 548 


7144 


864 


2650 


4436 


6222 


784CIP2B 549 


7159 


865 


2651 


4437 


6223 


784CIP2B 550 


7163


866 


2652 


4438 


6224 


784CIP2B_551


7175 


867 


2653 


4439 


6225 


784CIP2B 552 


7188 


868 


2654 


4440 


6226 


784CIP2B 553 


7189 


869 


2655 


4441 


6227 


784CIP2B_554 


7190 


870 


2656 


4442 


6228 


784CIP2B_555 


7191 


871 


2657 


4443 


6229 


784CIP2B_556 


7203 


872 


2658 


4444 


6230


784CIP2B 557 


7204 


873 


2659 


4445 


6231 


784CIP2B_558 


7208 


874 


2660


4446 


6232 


784CIP2B_559 


7209 


875 


2661 


4447 


6233 


784CIP2B_560


7210 


876 


2662


4448 


6234 


784CIP2B 561 


7216 


877 


2663 


4449 


6235 


784CIP2B_562 


7221 


878 


2664 


4450 


6236 


784CIP2B_563 


7230


879 


2665 


4451 


6237 


784CIP2B 564 


7237 


880 


2666 


4452 


6238 


784CIP2B 565 


7240 


881


2667 


4453 


6239 


784CIP2B_566


7245 


882 


2668 


4454 


6240 


784CIP2B_567 


7250 


883 


2669 


4455 


6241 


784CIP2B 568 


7251 


884 


2670 


4456


6242


784CIP2B 569 


725$ 


885 


2671 


4457 


6243 


784CIP2B_570 


7260 


886 


2672 


4458 


6244 


784CIP2B_571 


7265 


887 


2673 


4459 


6245 


784CIP2B 572 


7268 


888 


2674 


4460 


6246


784CIP2B_573


7275 


889 


2675 


4461 


6247 


784CIP2B 574 


7279 


890 


2676 


4462


6248 


784CIP2B_575


7283 


891 


2677 


4463 


6249 


784CIP2B_576


7283 


892 


2678 


4464 


6250 


784CIP2B_577


7287 


893 


2679 


4465 


6251 


784CIP2B_578


7301 


894 


2680 


4466 


6252 


784CIP2B_579 


7308 


895 


2681 


4467 


6253 


784CIP2B_580 


7308 


896 


2682 


4468 


6254 


784CIP2B_581


7309 


897 


2683 


4469 


6255 


784CIP2B_582 


7319 


898 


2684 


4470 


6256 


784CIP2B_583


7320 


899 


2685 


4471 


6257 


784CIP2B_584 


7326 


900 


2686 


4472 


6258 


784CIP2B_585 


7326 


901 


2687 


4473 


6259 


784CIP2B 586 


7334 


902 


2688 


4474 


6260 


784CIP2B_587 


7337 


903 


2689 


4475 


6261 


784CIP2B_588


7339 


904 


2690 


4476 


6262 


784CIP2B_589 


7344 


905 


2691 


4477 


6263


784CIP2B_590 


7355 


906


2692 


4478


6264 


784CIP2B_591 


7363 


907 


2693 


4479 


6265 


784CIP2B 592 


7363 


908


2694


4480 


6266 


784CIP2B_593 


7365 


909 


2695 


4481 


6267 


784CIP2B 594 


7368


910 


2696 


4482 


6268 


784CIP2B_595


7369 


911 


2697 


4483 


6269 


784CIP2B_596 


7372 


912 


2698 


4484 


6270 


784CIP2B_599 


7375 


913 


2699 


4485 


6271 


784CIP2B_600 


7381 


914 


2700 


4486 


6272 


784CIP2B_601 


7383 


915 


2701 


4487 


6273 


784CIP2B_602 


7387 


916 


2702 


4488 


6274 


784CIP2B_603 


7391 


917 


2703 


4489 


6275 


784CIP2B_604 


7393 


918 


2704 


4490 


6276 


784CIP2B_605 


7395 


919 


2705 


4491 


6277 


784CIP2B_606


7397 


920 


2706 


4492 


6278 


784CIP2B_607


7399 


921 


2707 


4493 


6279 


784CIP2B_608 


7405 


922 


2708 


4494 


6280 


784CIP2B 609 


7406 


923 


2709 


4495 


6281 


784CIP2B_610


7406 


924 


2710 


4496 


6282 


784CIP2B_611 


7409 


925 


2711 


4497 


6283 


784CIP2B 612 


7410 


926 


2712 


4498 


6284 


784CIP2B 613 


7411 


927 


2713 


4499 


6285 


784CIP2B 614 


7417


928 


2714 


4500 


6286 


784CIP2B_615 


7418 


929 


2715 


4501 


6287 


784CIP2B_616


7421 


930 


2716 


4502 


6288 


784CIP2B 617 


7422 


931 


2717 


4503 


6289 


784CIP2B_618


7422 


932 


2718 


4504 


6290 


784CIP2B_619 


7423 


933


2719 


4505


6291 


784CIP2B_620


7424 


934 


2720 


4506 


6292 


784CIP2B 621 


7426 


935


2721 


4507 


6293 


784CIP2B_622


7427 


936


2722 


4508 


6294 


784CIP2B_623 


7428


937


2723 


4509 


6295 


784CIP2B_624 


7430 


938 


2724 


4510 


6296 


784CIP2B_625


7435 


939 


2725 


4511 


6297 


784CIP2B_626 


7437 


940 


2726 


4512 


6298 


784CIP2B_627 


7439 


941 


2727 


4513 


6299 


784CIP2B_628 


7440 


942 


2728 


4514 


6300 


784CIP2B_629


7442 


943 


2729 


4515 


6301 


784CIP2B 630 


7450


944 


2730 


4516 


6302 


784CIP2B_631 


7451 


945 


2731 


4517 


6303 


784CIP2B 632 


7452 


946 


2732 


4518


6304 


784CIP2B_633


7454


947


2733 


4519 


6305 


784CIP2B_634 


7457 


948


2734 


4520 


6306 


784CIP2B_635 


7459 


949 


2735 


4521 


6307 


784CIP2B 636 


7461 


950 


2736 


4522 


6308


784CIP2B_637 


7463 


951 


2737 


4523 


6309 


784CIP2B 638 


7466 


952 


2738 


4524 


6310 


784CIP2B 639 


7469 


953 


2739 


4525 


6311 


784CIP2B_640 


7473


954 


2740 


4526 


6312 


784CIP2B_641 


7481 


955 


2741 


4527


6313 


784CIP2B 642 


7482 


956 


2742 


4528 


6314 


784CIP2B 643 


7482 


957 


2743 


4529 


6315 


784CIP2B_644 


7483 


958 


2744 


4530 


6316 


784CIP2B_645 


7485 


959 


2745 


4531 


6317 


784CIP2B_646 


7486 


960 


2746


4532


6318 


784CIP2B 647 


7487 


961 


2747 


4533 


6319 


784CIP2B_648 


7491 


962 


2748 


4534 


6320 


784CIP2B_649


7492 


963 


2749 


4535 


6321 


784CIP2B_650 


7494 


964 


2750 


4536


6322 


784CIP2B_651


7498 


965 


2751 


4537 


6323 


784CIP2B_652 


7504 


966 


2752 


4538


6324 


784CIP2B_653


7508 


967 


2753 


4539


6325 


784CIP2B_654 


7516 


968 


2754 


4540 


6326 


784CIP2B 655 


7518 


969 


2755 


4541 


6327


784CIP2B_656


7519 


970 


2756 


4542 


6328 


784CIP2B_657 


7521 


971 


2757 


4543 


6329 


784CIP2B_658


7529 


972 


2758 


4544 


6330 


784CIP2B_659 


7532 


973 


2759 


4545 


6331 


784CIP2B_660


7533


974 


2760 


4546 


6332 


784CIP2B_661 


7535 


975 


2761 


4547 


6333 


784CIP2B_662


7545 


976 


2762


4548 


6334 


784CIP2B_663 


7546 


977 


2763 


4549 


6335


784CIP2B_664 


7552 


978 


2764 


4550 


6336 


784CIP2B 665 


7554 


979 


2765


4551 


6337 


784CIP2B_666 


7567 


980 


2766 


4552 


6338 


784CIP2B_667


7569 


981 


2767 


4553 


6339 


784CIP2B 668 


7575 


982 


2768 


4554 


6340 


784CIP2B_669


7576 


983 


2769


4555 


6341 


784CIP2B_670


7577 


984 


2770 


4556 


6342 


784CIP2B_671 


7579 


985 


2771 


4557


6343 


784CIP2B_672


7582 


986 


2772


4558 


6344 


784CIP2B_673 


7587 


987 


2773 


4559 


6345 


784CIP2B_674


7589 


988 


2774 


4560 


6346 


784CIP2B 675 


7597 


989 


2775


4561 


6347 


784CIP2B 676 


7597


990 


2776 


4562


6348


784CIP2B_677


7609 


991 


2777


4563


6349


784CIP2B_678 


7609 


992 


2778 


4564 


6350


784CIP2B 679 


7609 


993 


2779 


4565 


6351


784CIP2B 680 


7613 


994 


2780 


4566


6352 


784CIP2B_681


7623 


995


2781


4567


6353 


784CIP2B_682


7629 


996


2782


4568 


6354 


784CIP2B 683 


7630 


997 


2783


4569 


6355 


784CIP2B 684 


7633 


998


2784


4570


6356 


784CIP2B 685 


7635 


999


2785 


4571 


6357 


784CIP2B 686 


7638 


1000 


2786 


4572 


6358 


784CIP2B_687 


7639


1001 


2787 


4573 


6359 


784CIP2B 688 


7646 


1002


2788 


4574 


6360 


784CIP2B_689


7647 


1003


2789


4575 


6361 


784CIP2B_690


7648 


1004 


2790 


4576 


6362 


784CIP2B 691 


7658 


1005 


2791 


4577 


6363 


784CIP2B 692 


7664 


1006 


2792 


4578 


6364 


784CIP2B 693 


7664 


1007 


2793


4579 


6365 


784CIP2B 695 


7674


1008


2794


4580


6366 


784CIP2B 696 


7675 


1009


2795 


4581 


6367 


784CIP2B 697 


7676 


1010 


2796 


4582 


6368 


784CIP2B 698 


7681 


1011 


2797


4583 


6369 


784CIP2B_699 


7688 


1012 


2798


4584 


6370 


784CIP2B 700 


7693 


1013 


2799


4585 


6371 


784CIP2B 701 


7694 


1014 


2800


4586 


6372 


784CIP2B 702 


7715 


1015


2801


4587


6373 


784CIP2B 703 


7716 


1016


2802


4588


6374 


784CIP2B 704 


7718 


1017 


2803


4589


6375 


784CIP2B_705 


7721 


1018 


2804


4590 


6376 


784CIP2B 706 


7723 


1019 


2805


4591 


6377 


784CIP2B 707 


7729 


1020 


2806


4592 


6378 


784CIP2B_708 


7733 


1021


2807


4593


6379 


784CIP2B_709 


7735 


1022


2808 


4594 


6380 


784CIP2B_710 


7741 


1023 


2809 


4595 


6381 


784CIP2B 711 


7743 


1024 


2810 


4596 


6382 


784CIP2B 712 


7748 


1025 


2811 


4597 


6383 


784CIP2B_713


7749 


1026 


2812 


4598 


6384 


784CIP2B 714 


7750 


1027


2813 


4599 


6385 


784CIP2B 715 


7757


1028


2814 


4600 


6386 


784CIP2B 716 


7759 


1029 


2815


4601


6387 


784CIP2B 717 


7760 


1030


2816 


4602 


6388 


784CIP2B 718 


7760 


1031 


2817 


4603 


6389 


784CIP2B 719 


7764 


1032 


2818 


4604 


6390 


784CIP2B 720 


7765 


1033


2819


4605


6391 


784CIP2B 721 


7766 


1034 


2820 


4606 


6392 


784CIP2B 722 


7767 


1035 


2821 


4607


6393 


784CIP2B 723 


7769 


1036 


2822 


4608


6394 


784CIP2B 724 


7770 


1037 


2823 


4609 


6395


784CIP2B_725


7774 


1038 


2824


4610


6396


784CIP2B 726 


7779 


1039 


2825


4611


6397


784CIP2B_727


7781 


1040


2826 


4612


6398 


784CIP2B 728 


7782 


1041 


2827 


4613


6399 


784CIP2B 729 


7783 


1042 


2828 


4614 


6400 


784CIP2B 730 


7787 


1043 


2829 


4615 


6401 


784CIP2B 731 


7792 


1044 


2830 


4616 


6402 


784CIP2B 732 


7795 


1045 


2831 


4617 


6403 


784CIP2B 733 


7801 


1046 


2832 


4618 


6404 


784CIP2B 734 


7807 


1047 


2833 


4619 


6405 


784CIP2B_735


7808 


1048 


2834 


4620 


6406 


784CIP2B 736 


7819 


1049 


2835 


4621 


6407 


784CIP2B_737 


7824 


1050 


2836 


4622 


6408 


784CIP2B 738 


7826 


1051 


2837 


4623 


6409 


784CIP2B_739


7829


1052 


2838 


4624 


6410 


784CIP2B_740


7832 


1053 


2839 


4625 


6411 


784CIP2B_741


7839 


1054 


2840 


4626 


6412 


784CIP2B 743 


7847 


1055 


2841 


4627 


6413


784CIP2B_744


7848 


1056 


2842


4628 


6414 


784CIP2B_745


7853 


1057 


2843 


4629 


6415


784CIP2B_746


7854


1058 


2844 


4630


6416


784CIP2B_747


7856 


1059


2845 


4631 


6417


784CIP2B_748


7862


1060 


2846 


4632 


6418


784CIP2B_749


7865


1061 


2847


4633 


6419


784CIP2B_750


7874 


1062 


2848 


4634 


6420 


784CIP2B_751


7877 


1063 


2849 


4635


6421


784CIP2B_752


7880 


1064 


2850 


4636 


6422 


784CIP2B_753


7882 


1065 


2851 


4637 


6423


784CIP2B_754


7884 


1066 


2852 


4638


6424


784CIP2B 755 


7886 


1067 


2853


4639


6425


784CIP2B_756


7888


1068 


2854


4640


6426


784CIP2B 757 


7889 


1069 


2855


4641 


6427 


784CIP2B 758 


7901 


1070


2856


4642


6428


784CIP2B 759 


7910 


1071


2857


4643


6429 


784CIP2B 760 


7911 


1072 


2858 


4644 


6430 


784CIP2B_761 


7921 


1073


2859 


4645 


6431 


784CIP2B 762 


7923 


1074 


2860 


4646 


6432 


784CIP2B 763 


7924 


1075 


2861 


4647 


6433 


784CIP2B 764 


7925 


1076 


2862 


4648 


6434 


784CIP2B 765 


7928 


1077 


2863 


4649 


6435 


784CIP2B_766


7929 


1078 


2864 


4650 


6436 


784CIP2B_767


7930 


1079 


2865 


4651 


6437 


784CIP2B 768 


7934 


1080 


2866


4652 


6438 


784CIP2B 769 


7938 


1081


2867


4653 


6439 


784CIP2B 770 


7942


1082


2868


4654 


6440 


784CIP2B 771 


7945 


1083 


2869


4655 


6441 


784CIP2B 772 


7946 


1084 


2870


4656 


6442 


784CIP2B 773 


7948


1085 


2871


4657 


6443 


784CIP2B 774 


7951 


1086 


2872


4658 


6444


784CIP2B 775 


7952 


1087


2873


4659


6445


784CIP2B 776 


7953 


1088 


2874


4660 


6446 


784CIP2B 777 


7954


1089 


2875 


4661


6447 


784CIP2B 778 


7957 


1090 


2876 


4662


6448


784CIP2B 779 


7958 


1091 


2877 


4663


6449 


784CIP2B_780


7961 


1092 


2878 


4664 


6450


784CIP2B 781 


7965 


1093 


2879 


4665


6451


784CIP2B_782


7966 


1094 


2880 


4666 


6452


784CIP2B_783


7979


1095 


2881 


4667


6453 


784CIP2B_784


7986 


1096 


2882 


4668 


6454 


784CIP2B_785


7986 


1097 


2883 


4669


6455


784CIP2B_786


7988 


1098 


2884 


4670 


6456 


784CIP2B_787


7991 


1099 


2885 


4671 


6457 


784CIP2B_788


7992 


1100 


2886 


4672 


6458


784CIP2B_789


7992 


1101 


2887 


4673 


6459


784CIP2B_790


7992 


1102 


2888 


4674


6460


784CIP2B 791 


7992 


1103 


2889 


4675


6461


784CIP2B 792 


8003 


1104 


2890 


4676 


6462 


784CIP2B_793


ftni a. 


1105 


2891 


4677 


6463 


784CIP2B 794 


8015 


1106 


2892 


4678 


6464 


784CIP2B 795 


8016 


1107 


2893 


4679 


6465 


784CIP2B 796 


8017 


1108 


2894


4680 


6466 


784CIP2B 797 


8019 


1109 


2895 


4681


6467 


784CIP2B 798 


8020 


1110 


2896 


4682 


6468 


784CIP2B 799 


8022 


1111 


2897 


4683 


6469 


784CIP2B 800 


8022 


1112 


2898 


4684 


6470 


784CIP2B 801 


8028 


1113 


2899 


4685 


6471 


784CIP2B 802 


8030


1114 


2900 


4686


6472


784CIP2B_803


8038 


1115 


2901


4687


6473


784CIP2B 804 


8042 


1116 


2902 


4688 


6474 


784CIP2B_805


8045 


1117


2903


4689


6475


784CIP2B 806 


8045 


1118


2904


4690


6476 


784CIP2B 807 


8046 


1119


2905 


4691 


6477 


784CIP2B 808 


8047


1120


2906


4692 


6478 


784CIP2B 809 


8051


1121


2907


4693 


6479 


784CIP2B_810


8059


1122


2908


4694 


6480 


784CIP2B 811 


8064 


1123


2909 


4695 


6481 


784CIP2B 812 


8069 


1124 


2910 


4696


6482 


784CIP2B 813 


8074 


1125 


2911 


4697 


6483 


784CIP2B 814 


8077 


1126 


2912 


4698 


6484 


784CIP2B 815 


8078 


1127 


2913 


4699 


6485


784CIP2B 816 


8079 


1128 


2914 


4700 


6486 


784CIP2B 817 


8084 


1129 


2915


4701 


6487 


784CIP2B_818


8088


1130 


2916 


4702 


6488 


784CIP2B_819


8090


1131 


2917 


4703 


6489 


784CIP2B 820 


8091


1132


2918


4704 


6490 


784CIP2B 821 


8099


1133


2919


4705 


6491 


784CIP2B 822 


8099


1134


2920


4706


6492 


784CIP2B 823 


8100 


1135


2921 


4707 


6493 


784CIP2B 824 


8102 


1136


2922 


4708 


6494 


784CIP2B 825 


8103 


1137


2923 


4709 


6495 


784CIP2B 826 


8103 


1138


2924 


4710 


6496 


784CIP2B 827 


8104 


1139


2925


4711


6497 


784CIP2B 828 


8108 


1140 


2926


4712 


6498 


784CIP2B 829 


8110 


1141


2927 


4713 


6499 


784CIP2B_830 


8116 


1142


2928


4714 


6500 


784CIP2B 831 


8117 


1143


2929 


4715 


6501 


784CIP2B_832


8123 


1144 


2930 


4716 


6502


784CIP2B 833 


8130 


1145 


2931 


4717 


6503 


784CIP2B 834 


8130 


1146 


2932 


4718 


6504 


784CIP2B 835 


8143 


1147 


2933 


4719 


6505 


784CIP2B 836 


8143 


1148 


2934 


4720 


6506 


784CIP2B 837 


8154 


1149 


2935 


4721 


6507 


784CIP2B 838 


8155 


1150 


2936 


4722 


6508


784CIP2B 839 


8162 


1151 


2937


4723 


6509 


784CIP2B_840


8163


1152 


2938


4724


6510 


784CIP2B 841 


8172 


1153 


2939 


4725 


6511 


784CIP2B 842 


8173 


1154 


2940 


4726 


6512 


784CIP2B_843 


8179 


1155


2941 


4727 


6513 


784CIP2B 844 


8182


1156


2942


4728 


6514 


784CIP2B 845 


8183 


1157 


2943 


4729 


6515 


784CIP2B 846 


8184 


1158 


2944 


4730 


6516 


784CIP2B 847 


8185 


1159


2945


4731


6517 


784CIP2B 848 


8187 


1160 


2946


4732 


6518 


784CIP2B_849


8188 


1161


2947


4733


6519 


784CIP2B 850 


8190 


1162 


2948


4734 


6520 


784CIP2B 851 


8190 


1163 


2949 


4735


6521 


784CIP2B 852 


8192 


1164 


2950 


4736


6522 


784CIP2B 853 


8193 


1165


2951


4737


6523 


784CIP2B 854 


8197 


1166 


2952 


4738 


6524 


784CIP2B 855 


8197 


1167 


2953 


4739 


6525 


784CIP2B 856 


8199 


1168 


2954 


4740 


6526


784CIP2B 857 


8202 


1169 


2955 


4741 


6527 


784CIP2B 858 


8203 


1170 


2956 


4742 


6528 


784CIP2B 859 


8208 


1171 


2957 


4743 


6529


784CIP2B 860 


8209 


1172 


2958 


4744 


6530 


784CIP2B 861 


8211 


1173 


2959 


4745 


6531 


784CIP2B 862 


8214 


1174 


2960 


4746 


6532 


784CIP2B 863 


8217 


1175 


2961


4747 


6533 


784CIP2B_864


8223


1176 


2962


4748 


6534 


784CIP2B 865 


8224 


1177 


2963 


4749 


6535


784CIP2B_866


8226


1178 


2964 


4750 


6536 


784CIP2B 867 


8227 


1179 


2965 


4751 


6537 


784CIP2B_868 


8229 


1180 


2966 


4752 


6538 


784CIP2B_869 


8232 


1181 


2967 


4753 


6539


784CIP2B_870


8236


1182 


2968 


4754 


6540 


784CIP2B 871 


8239 


1183


2969 


4755 


6541 


784CIP2B_872 


8244 


1184 


2970 


4756 


6542 


784CIP2B_873 


8245 


1185 


2971 


4757 


6543 


784CIP2B_874 


8248 


1186


2972 


4758 


6544 


784CIP2B 875 


8251 


1187 


2973 


4759 


6545 


784CIP2B_876 


8253 


1188 


2974 


4760 


6546 


784CIP2B_877


8260 


1189 


2975 


4761 


6547 


784CIP2B_878 


8262 


1190 


2976 


4762 


6548 


784CIP2B 879 


8268 


1191 


2977 


4763 


6549 


784CIP2B_880


8270 


1192 


2978 


4764 


6550 


784CIP2B_881


8272 


1193 


2979 


4765 


6551 


784CIP2B_882 


8274 


1194 


2980 


4766


6552


784CIP2B_883


8274 


1195 


2981 


4767


6553 


784CIP2B_884


8275


1196 


2982 


4768 


6554 


784CIP2B_885


8277 


1197 


2983 


4769 


6555 


784CIP2B 886 


8281 


1198 


2984 


4770 


6556


784CIP2B 887 


8283 


1199 


2985 


4771 


6557 


784CIP2B_888


8289 


1200 


2986 


4772 


6558 


784CIP2B 889 


8295 


1201 


2987 


4773 


6559 


784CIP2B 890 


8300 


1202 


2988 


4774 


6560 


784CIP2B 891 


8303 


1203 


2989 


4775 


6561 


784CIP2B 892 


8304 


1204 


2990 


4776 


6562 


784CIP2B 893 


8305 


1205 


2991 


4777 


6563 


784CIP2B_894 


8309 


1206 


2992 


4778 


6564 


784CIP2B 895 


8318 


1207 


2993 


4779 


6565 


784CIP2B_896


8319 


1208 


2994 


4780 


6566 


784CIP2B_897 


8321 


1209 


2995 


4781 


6567 


784CIP2B_898


8322 


1210 


2996 


4782 


6568


784CIP2B_899


8323 


1211 


2997 


4783 


6569 


784CIP2B_900


8325 


1212 


2998 


4784 


6570 


784CIP2B 901 


8331 


1213 


2999 


4785 


6571 


784CIP2B_902 


8332 


1214 


3000 


4786 


6572 


784CIP2B_903 


8333 


1215 


3001 


4787 


6573 


784CIP2B_904 


8335


1216 


3002 


4788 


6574 


784CIP2B_905 


8336 


1217 


3003 


4789 


6575 


784CIP2B_906 


8337 


1218 


3004 


4790 


6576 


784CIP2B_907 


8340 


1219 


3005 


4791 


6577 


784CIP2B_908 


8343 


1220 


3006 


4792 


6578 


784CIP2B_909 


8347 


1221 


3007 


4793 


6579 


784CIP2B_910 


8349 


1222 


3008 


4794 


6580 


784CIP2B_911 


8351 


1223 


3009 


4795 


6581 


784CIP2B_912 


8353 


1224 


3010 


4796 


6582 


784CIP2B_913 


8355 


1225 


3011 


4797 


6583 


784CIP2B_914 


8361 


1226 


3012 


4798 


6584 


784CIP2B_915 


8365 


1227 


3013 


4799 


6585


784CIP2B_916 


8367 


1228 


3014 


4800 


6586


784CIP2B 917 


8369 


1229 


3015 


4801 


6587 


784CIP2B_919 


8375 


1230 


3016 


4802 


6588 


784CIP2B 920 


8387 


1231 


3017 


4803 


6589


784CIP2B_921 


8391 


1232


3018 


4804 


6590 


784CIP2B 922 


8393 


1233 


3019 


4805 


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B_924 


8394 


1235 


3021 


4807 


6593 


784CIP2B_925 


8395 


1236 


3022 


4808 


6594 


784CIP2B 926 


8396 


1237


3023


4809 


6595 


784CIP2B_927 


8398


1238 


3024 


4810 


6596 


784CIP2B_928


8402 


1239 


3025


4811 


6597


784CIP2B_929


8402 


1240 


3026 


4812 


6598


784CIP2B 930 


8405 


1241 


3027 


4813


6599


784CIP2B_931


8406 


1242 


3028 


4814 


6600 


784CIP2B 932 


8409


1243 


3029 


4815


6601 


784CIP2B 933 


8410 


1244 


3030 


4816 


6602 


784CIP2B 934 


8414


1245 


3031


4817


6603


784CIP2B 935 


8415 


1246 


3032 


4818 


6604 


784CIP2B_936 


8419 


1247 


3033


4819


6605 


784CIP2B 937 


8426


1248 


3034 


4820 


6606 


784CIP2B_938


8430 


1249


3035


4821


6607 


784CIP2B 939 


8431 


1250 


3036 


4822


6608 


784CIP2B 940 


8432 


1251 


3037


4823 


6609 


784CIP2B 941 


8433 


1252 


3038


4824 


6610 


784CIP2B_942 


8434 


1253


3039


4825


6611 


784CIP2B 943 


8438 


1254 


3040


4826 


6612 


784CIP2B 944 


8439 


1255


3041


4827 


6613 


784CIP2B 945 


8441 


1256


3042


4828


6614 


784CIP2B 946 


8450 


1257 


3043


4829 


6615 


784CIP2B 947 


8451 


1258 


3044 


4830 


6616 


784CIP2B_948 


8452 


1259


3045


4831


6617 


784CIP2B_949


8460 


1260 


3046


4832 


6618 


784CIP2B 950 


8461 


1261 


3047


4833


6619 


784CIP2B_951


8462 


1262 


3048 


4834 


6620 


784CIP2B 952 


8464


1263 


3049


4835 


6621 


784CIP2B 953 


8465 


1264 


3050


4836 


6622 


784CIP2B_954


8467 


1265


3051 


4837


6623 


784CIP2B 955 


8470 


1266 


3052 


4838


6624 


784CIP2B 956 


8471 


1267 


3053 


4839 


6625 


784CIP2B_957 


8473 


1268 


3054 


4840


6626 


784CIP2B 958 


8474 


1269 


3055 


4841


6627 


784CIP2B 959 


8475 


1270 


3056 


4842


6628 


784CIP2B 960 


8476 


1271 


3057


4843


6629 


784CIP2B 961 


8480 


1272 


3058


4844 


6630 


784CIP2B 962 


8482 


1273


3059


4845


6631


784CIP2B 963 


8482 


1274 


3060


4846 


6632 


784CIP2B 964 


8486 


1275 


3061


4847 


6633 


784CIP2B 965 


8488 


1276 


3062 


4848


6634 


784CIP2B_966 


8492 


1277 


3063 


4849 


6635 


784CIP2B_967 


8494 


1278 


3064


4850


6636 


784CIP2B 968 


8496 


1279 


3065


4851


6637 


784CIP2B 969 


8497 


1280


3066


4852


6638


784CIP2B 970 


8499 


1281 


3067 


4853


6639 


784CIP2B 971 


8513 


1282 


3068 


4854 


6640 


784CIP2B_972


8522 


1283 


3069 


4855


6641 


784CIP2B 973 


8526 


1284 


3070 


4856


6642 


784CIP2B 974 


8531 


1285 


3071 


4857


6643 


784CIP2B 975 


8533 


1286 


3072


4858


6644 


784CIP2B 976 


8542 


1287 


3073 


4859


6645 


784CIP2B 977 


8544 


1288 


3074


4860


6646


784CIP2B_978 


8565


1289 


3075 


4861


6647 


784CIP2B 979 


8565 


1290 


3076 


4862 


6648 


784CIP2B 980 


8572 


1291 


3077 


4863


6649 


784CIP2B 981 


8576 


1292 


3078 


4864 


6650 


784CIP2B 982 


8578 


1293 


3079 


4865 


6651 


784CIP2B 983 


8584 


1294 


3080 


4866 


6652 


784CIP2B 984 


8598 


1295 


3081 


4867


6653 


784CIP2B 985 


8602 


1296 


3082 


4868 


6654 


784CIP2B 986 


8604 


1297 


3083 


4869 


6655 


784CIP2B 987 


8609 


1298 


3084 


4870 


6656 


784CIP2B 988 


8612 


1299 


3085 


4871 


6657


784CIP2B 989 


8637


1300 


3086 


4872 


6658 


784CIP2B 990 


8640 


1301 


3087 


4873 


6659 


784CIP2B_991 


8643 


1302 


3088 


4874 


6660 


784CIP2B 992 


8645 


1303 


3089 


4875 


6661 


784CIP2B 993 


8650 


1304 


3090 


4876 


6662 


784CIP2B 994 


8651 


1305 


3091 


4877 


6663 


784CIP2B 995 


8654 


1306


3092 


4878 


6664 


784CIP2B 996 


8655 


1307


3093 


4879 


6665 


784CIP2B 997 


8657 


1308 


3094 


4880 


6666 


784CIP2B 998 


8665 


1309 


3095 


4881 


6667 


784CIP2B 999 


8668 


1310 


3096 


4882 


6668 


784CIP2B_1000 


8671 


1311 


3097 


4883 


6669 


784CIP2B 1001 


8672 


1312 


3098 


4884 


6670 


784CIP2B 1002 


8692 


1313 


3099 


4885 


6671 


784CIP2B_1003


8706


1314 


3100 


4886 


6672


784CIP2B_1004


8716 


1315 


3101 


4887 


6673


784CIP2B 1005 


8719 


1316 


3102 


4888 


6674 


784CIP2B_1006 


8743 


1317 


3103 


4889 


6675 


784CIP2B 1007 


8764 


1318 


3104 


4890 


6676 


784CIP2B 1008 


8764 


1319 


3105 


4891 


6677


784CIP2B 1009 


8764 


1320 


3106 


4892 


6678


784CIP2B 1010 


8774 


1321 


3107 


4893 


6679 


784CIP2B_1011 


8782 


1322 


3108 


4894 


6680 


784CIP2B_1012 


8796 


1323


3109 


4895 


6681 


784CIP2B 1013 


8827 


1324 


3110 


4896 


6682 


784CIP2B_1014


8842 


1325 


3111 


4897 


6683 


784CIP2B_1015 


8842 


1326 


3112 


4898 


6684 


784CIP2B 1016 


8858 


1327 


3113 


4899 


6685 


784CIP2B 1017 


8871 


1328 


3114 


4900 


6686


784CIP2B 1018 


8921 


1329 


3115 


4901 


6687 


784CIP2B 1019 


8927 


1330 


3116 


4902 


6688 


784CIP2B 1020 


8942 


1331 


3117 


4903 


6689 


784CIP2B_1021


8994


1332 


3118 


4904


6690


784CIP2B_1022


9023 


1333 


3119 


4905 


6691 


784CIP2B 1023 


9028 


1334 


3120 


4906 


6692 


784CIP2B_1024 


9058 


1335 


3121 


4907 


6693 


784CIP2B_1025 


9058 


1336 


3122 


4908 


6694 


784CIP2B_1026 


9079 


1337 


3123 


4909 


6695 


784CIP2B 1027 


9079 


1338 


3124 


4910 


6696 


784CIP2B_1028 


9082 


1339 


3125 


4911 


6697 


784CIP2B_1029 


9084 


1340 


3126 


4912 


6698 


784CIP2B_1030 


9093 


1341 


3127 


4913 


6699 


784CIP2B 1031 


9101 


1342 


3128 


4914 


6700 


784CIP2B_1032 


9103 


1343 


3129 


4915 


6701 


784CIP2B_1033 


9105 


1344 


3130 


4916 


6702 


784CIP2B_1034 


9151


1345 


3131 


4917 


6703 


784CIP2B 1035 


9161 


1346 


3132 


4918 


6704 


784CIP2B_1036


9172 


1347 


3133 


4919 


6705 


784CIP2B_1037 


9174 


1348 


3134 


4920 


6706 


784CIP2B_1038


9204 


1349 


3135 


4921 


6707 


784CIP2B 1039 


9234 


1350 


3136 


4922 


6708 


784CIP2B_1040


9235 


1351 


3137 


4923 


6709 


784CIP2B_1041 


9239 


1352 


3138 


4924 


6710 


784CIP2B_1042 


9256 


1353 


3139 


4925 


6711 


784CIP2B_1043


9276 


1354 


3140 


4926 


6712 


784CIP2B_1044


9345


1355 


3141 


4927 


6713 


784CIP2B_1045 


9379 


1356 


3142 


4928 


6714 


784CIP2B_1046


9435 


1357 


3143 


4929 


6715


784CIP2B_1047 


9437 


1358 


3144 


4930 


6716 


784CIP2B 1048 


9469 


1359 


3145 


4931 


6717 


784CIP2B_1049 


9500 


1360 


3146 


4932 


6718 


784CIP2B 1050 


9502 


1361 


3147 


4933 


6719 


784CIP2B 1051 


9520


1362 


3148 


4934 


6720 


784CIP2B 1052 




1363 


3149 


4935 


6721 


784CIP2B_1053


9541 


1364 


3150 


4936 


6722 


784CIP2B_1054


9548 


1365 


3151 


4937 


6723 


784CIP2B_1055


9556 


1366 


3152 


4938


6724


784CIP2B_1056


9556


1367 


3153 


4939 


6725


784CIP2B_1057


9575


1368 


3154 


4940 


6726 


784CIP2B_1058


9589 


1369


3155 


4941 


6727


784CIP2B_1059


9599


1370 


3156 


4942 


6728 


784CIP2B_1060


9602 


1371 


3157 


4943 


6729


784CIP2B 1061 


9606 


1372 


3158 


4944 


6730


784CIP2B 1062 


9622 


1373 


3159 


4945


6731


784CIP2B 1063 


9623 


1374 


3160 


4946


6732


784CIP2B_1064


9646 


1375 


3161 


4947 


6733 


784CIP2B 1065 


9747 


1376


3162


4948 


6734


784CIP2B 1066 


9773 


1377 


3163 


4949 


6735 


784CIP2B_1067


9785 


1378 


3164 


4950 


6736 


784CIP2B 1068 


9801 


1379 


3165 


4951 


6737 


784CIP2B_1069


9811 


1380


3166 


4952 


6738


784CIP2B_1070 


9843 


1381 


3167 


4953


6739 


784CIP2B_1071


9854 


1382 


3168 


4954 


6740 


784CIP2B 1072 


9854 


1383 


3169


4955 


6741


784CIP2B_1073


9864 


1384 


3170 


4956 


6742 


784CIP2B_1074


9864 


1385 


3171 


4957 


6743


784CIP2B_1075


9871 


1386 


3172 


4958 


6744


784CIP2B_1076


9879 


1387 


3173 


4959 


6745


784CIP2B_1077


9881 


1388 


3174 


4960 


6746 


784CIP2B 1078 


9885 


1389 


3175 


4961


6747


784CIP2B_1079


9901 


1390 


3176


4962


6748


784CIP2B_1080


9912 


1391 


3177 


4963


6749


784CIP2B 1081 


9916 


1392 


3178 


4964 


6750 


784CIP2B 1082 


9921 


1393 


3179 


4965 


6751


784CIP2B_1083 


9925 


1394 


3180 


4966 


6752 


784CIP2B 1084 


9930 


1395 


3181 


4967


6753 


784CIP2B 1085 


9949 


1396 


3182 


4968 


6754 


784CIP2B_1086


9951 


1397 


3183 


4969


6755


784CIP2B_1087


9959


1398 


3184 


4970 


6756


784CIP2B_1088


9973 


1399 


3185 


4971 


6757 


784CIP2B 1089 


9982 


1400 


3186 


4972 


6758


784CIP2B_1090


9994 


1401 


3187 


4973 


6759 


784CIP2B_1091


10021 


1402 


3188 


4974 


6760 


784CIP2B 1092 


10041 


1403 


3189 


4975 


6761


784CIP2B 1094 


10067 


1404 


3190 


4976 


6762


784CIP2B_1095


10073


1405 


3191 


4977 


6763 


784CIP2B_1096


10112 


1406 


3192 


4978 


6764 




10117 


1407 


3193 


4979 


6765




10132 


1408 


3194 


4980 


6766 




10169 


1409 


3195


4981 


6767 


7fl4PTT)OR T t nn 


10217 


1410 


3196 


4982 


6768 




10226


1411 


3197 


4983


6769 


7ft4PTP9R urn 


10232 


1412 


3198 


4984 


6770 


7fl4t!!IP5S nni '"' 


10237 


1413 


3199


4985 


6771 




10279 


1414 


3200 


4986 


6772 


784CIP2C 1 


33 


1415 


3201 


4987 


6773 


784CIP2C 2 


271 


1416 


3202 


4988 


6774 


784CIP2C 3 


848


1417 


3203 


4989 


6775 


784CIP2C 4 


849 


1418 


3204 


4990 


6776 


784CIP2C 5 


864 


1419 


3205 


4991 


6777 


784CIP2C 6 


953 


1420 


3206 


4992 


6778 


784CIP2C 7 


980 


1421 


3207 


4993 


6779


784CIP2C 8 


1595 


1422 


3208 


4994 


6780 


784CIP2C 9 


1697 


1423 


3209 


4995 


6781 


784CIP2C_10 


1744


1424


3210


4996 


6782 


784CIP2C 11 


1937 


1425 


3211 


4997 


6783 


784CIP2C 12 


1955 


1426


3212


4998


6784 


784CIP2C 13 


1955 


1427 


3213


4999 


6785 


784CIP2C 14 


2185 


1428 


3214 


5000 


6786 


784CIP2C 15 


2889 


1429 


3215


5001


6787 


784CIP2C 16 


2901 


1430 


3216 


5002 


6788 


784CIP2C 17 


2902 


1431 


3217 


5003 


6789 


784CIP2C_18


2905 


1432 


3218 


5004


6790 


784CIP2C 19 


2948 


1433 


3219


5005


6791


784CIP2C 20 


2956 


1434 


3220


5006


6792


784CIP2C_21


2959 


1435 


3221


5007


6793


784CIP2C_22


2965 


1436 


3222


5008 


6794


784CIP2C 23 


2966 


1437


3223


5009 


6795 


784CIP2C 24 


2970


1438


3224


5010 


6796 


784CIP2C 25 


2985 


1439


3225 


5011 


6797 


784CIP2C 26 


2987 


1440


3226


5012


6798 


784CIP2C 27 


2993 


1441 


3227 


5013 


6799 


784CIP2C_28 


2993 


1442 


3228 


5014 


6800 


784CIP2C 29 


3017 


1443 


3229 


5015 


6801 


784CIP2C_30


3046 


1444 


3230 


5016 


6802 


784CIP2C 31 


3050 


1445 


3231 


5017 


6803 


784CIP2C 32 


3357 


1446 


3232 


5018 


6804 


784CIP2C 33 


3359 


1447


3233 


5019 


6805 


784CIP2C 34 


3432 


1448


3234 


5020 


6806 


784CIP2C 35 


3438 


1449 


3235 


5021 


6807 


784CIP2C 36 


3439 


1450 


3236 


5022 


6808 


784CIP2C_39 


3463 


1451 


3237 


5023 


6809 


784CIP2C 40 


3466 


1452 


3238 


5024 


6810 


784CIP2C 41 


3466 


1453 


3239 


5025 


6811


784CIP2C 42 


3467 


1454 


3240 


5026 


6812


784CIP2C_43 


3468 


1455 


3241 


5027 


6813 


784CIP2C_44 


3483 


1456 


3242 


5028


6814 


784CIP2C 45 


3484 


1457 


3243 


5029 


6815 


784CIP2C_46


3488 


1458 


3244 


5030 


6816 


784CIP2C_47 


3491 


1459 


3245 


5031 


6817 


784CIP2C_48 


3493 


1460


3246 


5032 


6818 


784CIP2C 49 


3494


1461


3247


5033 


6819 


784CIP2C 50 


3495 


1462 


3248 


5034 


6820 


784CIP2C 51 


3496 


1463


3249


5035


6821 


784CIP2C 52 


3503 


1464


3250


5036


6822 


784CIP2C 53 


3503 


1465 


3251 


5037


6823


784CIP2C 54 


3504 


1466


3252 


5038


6824


784CIP2C 55 


3511 


1467 


3253 


5039


6825 


784CIP2C_56


3531 


1468 


3254


5040


6826


784CIP2C 57 


3536 


1469


3255


5041


6827


784CIP2C_58


3546 


1470 


3256 


5042


6828 


784CIP2C_59


3548 


1471 


3257 


5043 


6829


784CIP2C_60


3551 


1472 


3258


5044


6830


784CIP2C_61 


3553 


1473


3259


5045


6831 


784CIP2C 62 


3564 


1474 


3260


5046 


6832 


784CIP2C 63 


3567 


1475


3261


5047 


6833 


784CIP2C 64 


3572 


1476 


3262 


5048


6834 


784CIP2C_65


3573


1477 


3263 


5049 


6835 


784CIP2C 66 


3574 


1478 


3264 


5050 


6836 


784CIP2C 67 


3583 


1479 


3265 


5051 


6837 


784CIP2C 68 


3615 


1480 


3266 


5052 


6838 


784CIP2C_69 


3623 


1481 


3267 


5053 


6839 


784CIP2C_70


3629 


1482


3268 


5054 


6840 


784CIP2C 71 


3666 


1483 


3269 


5055 


6841 


784CIP2C 72 


3667 


1484 


3270 


5056 


6842 


784CIP2C 73 


3906 


1485 


3271 


5057 


6843 


784CIP2C_74


3912


1486 


3272 


5058 


6844 


784CIP2C_75


3924 


1487 


3273 


5059 


6845 


784CIP2C 76 


3928 


1488


3274 


5060 


6846 


784CIP2C 77 


3935 


1489 


3275 


5061 


6847 


784CIP2C 78 


3959 


1490 


3276 


5062 


6848 


784CIP2C_79


3981 


1491 


3277 


5063 


6849 


784CIP2C_80 


3989 


1492 


3278


5064 


6850 


784CIP2C 81 


4295 


1493 


3279 


5065 


6851 


784CIP2C 82 


4300 


1494 


3280 


5066 


6852 


784CIP2C_83 


4360 


1495 


3281 


5067 


6853 


784CIP2C_84


4362 


1496 


3282 


5068 


6854 


784CIP2C_85 


4371 


1497


3283 


5069 


6855 


784CIP2C_86 


4373 


1498 


3284 


5070 


6856


784CIP2C_87 


4376 


1499 


3285 


5071 


6857 


784CIP2C 89 


4378 


1500 


3286 


5072 


6858 


784CIP2C 90 


4382 


1501 


3287


5073 


6859 


784CIP2C_91 


4409 


1502


3288 


5074 


6860 


784CIP2C_92 


4421 


1503 


3289 


5075 


6861 


784CIP2C 93 


4421 


1504 


3290 


5076 


6862 


784CIP2C 94 


4426 


1505 


3291 


5077 


6863 


784CIP2C_95


4430 


1506 


3292 


5078 


6864 


784CIP2C_96 


4435 


1507 


3293 


5079 


6865


784CIP2C_97 


4436 


1508


3294


5080 


6866 


784CIP2C 98 


4439


1509 


3295 


5081 


6867 


784CIP2C_99 


4440


1510


3296 


5082 


6868 


784CIP2C_100 


4441 


1511 


3297 


5083 


6869


784CIP2C 101 


4442 


1512 


3298 


5084 


6870


784CIP2C 102 


4455 


1513


3299 


5085 


6871


784CIP2C_103


4462 


1514 


3300 


5086 


6872 


784CIP2C_104


4466 


1515 


3301 


5087 


6873 


784CIP2C_105 


4469 


1516 


3302 


5088 


6874


784CIP2C_106


4477 


1517 


3303 


5089 


6875 


784CIP2C_107


4481 


1518 


3304 


5090 


6876 


784CIP2C_108


4483 


1519 


3305 


5091 


6877 


784CIP2C_109


4484 


1520 


3306 


5092 


6878 


784CIP2C_110 


4486 


1521 


3307 


5093 


6879


784CIP2C_111


4490 


1522 


3308 


5094 


6880 


784CIP2C_112


4499 


1523 


3309 


5095 


6881


784CIP2C 113 


4503 


1524 


3310 


5096 


6882


784CIP2C_114 


4506 


1525 


3311 


5097 


6883


784CIP2C 115 


4509 


1526


3312 


5098 


6884 


784CIP2C 116 


4514 


1527 


3313 


5099 


6885 


784CIP2C_117


4516 


1528 


3314 


5100 


6886 


784CIP2C 118 


4522 


1529 


3315 


5101 


6887 


784CIP2C_119


4525 


1530 


3316 


5102 


6888


784CIP2C 120 


4527 


1531 


3317 


5103 


6889


784CIP2C_121 


4528 


1532 


3318 


5104 


6890 


784CIP2C_122


4529 


1533 


3319 


5105 


6891 


784CIP2C_123


4532 


1534 


3320 


5106 


6892 


784CIP2C_124


4537 


1535 


3321 


5107 


6893 


784CIP2C_125


4538 


1536 


3322 


5108 


6894 


784CIP2C_126


4551 


1537 


3323 


5109 


6895 


784CIP2C_127 


4552 


1538 


3324


5110 


6896 


784CIP2C 128 


4559 


1539 


3325 


5111 


6897 


784CIP2C_129


4567 


1540 


3326 


5112 


6898 


784CIP2C_130 


4568 


1541 


3327 


5113 


6899 


784CIP2C_132 


4585 


1542


3328 


5114 


6900 


784CIP2C 133 


4592 


1543 


3329 


5115 


6901 


784CIP2C 134 


4609 


1544 


3330 


5116 


6902 


784CIP2C 135 


4616 


1545 


3331 


5117 


6903 


784CIP2C 136 


4617 


1546 


3332 


5118 


6904 


784CIP2C 137 


4618 


1547 


3333


5119 


6905 


784CIP2C 138 


4620


1548 


3334 


5120 


6906 


784CIP2C_139


4624 


1549


3335


5121 


6907 


784CIP2C 140 


4632 


1550 


3336 


5122 


6908 


784CIP2C 141 


4634 


1551 


3337


5123 


6909 


784CIP2C 142 


4638 


1552 


3338


5124 


6910 


784CIP2C_143 


4639 


1553 


3339


5125 


6911 


784CIP2C 144 


4643 


1554 


3340 


5126 


6912 


784CIP2C 145 


4644 


1555 


3341 


5127 


6913 


784CIP2C 146 


4655 


1556 


3342


5128 


6914 


784CIP2C_147


4668 


1557 


3343 


5129


6915 


784CIP2C 148 


4677


1558 


3344 


5130 


6916 


784CIP2C 149 


4677


1559


3345


5131


6917 


784CIP2C 150 


4677 


1560


3346


5132 


6918 


784CIP2C_152 


4682 


1561 


3347 


5133 


6919 


784CIP2C 153 


4690 


1562 


3348 


5134 


6920 


784CIP2C 154 


4691 


1563 


3349 


5135 


6921 


784CIP2C 155 


4727 


1564 


3350 


5136 


6922 


784CIP2C_156 


4730 


1565 


3351 


5137 


6923 


784CIP2C 157 


4734 


1566 


3352 


5138 


6924 


784CIP2C 158 


4757 


1567 


3353 


5139 


6925 


784CIP2C 159 


4764 


1568 


3354 


5140 


6926 


784CIP2C 160 


4786 


1569 


3355 


5141 


6927 


784CIP2C 161 


4793 


1570 


3356 


5142 


6928 


784CIP2C_162


4825 


1571 


3357 


5143 


6929 


784CIP2C 163 


4826 


1572 


3358 


5144 


6930 


784CIP2C_164


4850 


1573 


3359 


5145 


6931


784CIP2C 165 


4853 


1574 


3360 


5146 


6932 


784CIP2C 166 


4855 


1575 


3361 


5147 


6933 


784CIP2C 167 


4856 


1576 


3362 


5148 


6934 


784CIP2C 168 


4867 


1577 


3363 


5149 


6935 


784CIP2C_169


4869 


1578


3364 


5150 


6936 


784CIP2C 170 


4878 


1579 


3365 


5151 


6937 


784CIP2C 171 


4880 


1580 


3366 


5152 


6938 


784CIP2C_172 


4942 


1581 


3367 


5153 


6939 


784CIP2C 173 


4945 


1582 


3368 


5154 


6940 


784CIP2C_174 


4950 


1583 


3369 


5155 


6941


784CIP2C_175


4952 


1584 


3370 


5156 


6942 


784CIP2C 176 


4954 


1585 


3371


5157 


6943 


784CIP2C_177 


4958 


1586 


3372 


5158 


6944 


784CIP2C 178 


4961


1587 


3373 


5159 


6945 


784CIP2C 179 


5590 


1588


3374 


5160 


6946 


784CIP2C 180 


5599 


1589 


3375 


5161 


6947 


784CIP2C 181 


5692 


1590 


3376


5162 


6948 


784CIP2C 182 


5732 


1591 


3377


5163 


6949 


784CIP2C 183 


5765 


1592 


3378


5164 


6950 


784CIP2C 184 


5771 


1593


3379


5165 


6951 


784CIP2C_185


5774 


1594 


3380


5166 


6952 


784CIP2C 186 


5793 


1595 


3381


5167 


6953 


784CIP2C_187


5806 


1596


3382


5168 


6954 


784CIP2C 188 


5852 


1597


3383


5169 


6955 


784CIP2C 189 


5892 


1598 


3384 


5170 


6956


784CIP2C_190


6057 


1599 


3385 


5171 


6957 


784CIP2C 191 


6061 


1600 


3386 


5172


6958 


784CIP2C_192


6109 


1601 


3387 


5173 


6959 


784CIP2C_193 


6160 


1602 


3388 


5174 


6960 


784CIP2C 194 


6297 


1603 


3389 


5175 


6961 


784CIP2C 195 


6398 


1604 


3390 


5176 


6962 


784CIP2C_196


629Q 


1605 


3391 


5177 


6963 


784CIP2C 197 


6415 


1606 


3392 


5178 


6964 


784CIP2C 198 


6448 


1607 


3393 


5179 


6965 


784CIP2C 199 


6469 


1608 


3394 


5180 


6966


784CIP2C 200 


6476 


1609 


3395


5181 


6967 


784CIP2C 201 


6561


1610 


3396 


5182 


6968 


784CIP2C_202


6574 


1611 


3397 


5183 


6969 


784CIP2C_203


6578 


1612 


3398 


5184 


6970 


784CIP2C 204 


6662 


1613 


3399 


5185 


6971 


784CIP2C 205 


6672


1614 


3400 


5186 


6972 


784CIP2C 206 


6691 


1615 


3401 


5187 


6973 


784CIP2C_207 


6695 


1616 


3402 


5188 


6974 


784CIP2C 208 


6746


1617 


3403 


5189


6975 


784CIP2C 209 


6898 


1618 


3404 


5190 


6976 


784CIP2C_210


6938 


1619 


3405 


5191 


6977 


784CIP2C 211 


6943 


1620 


3406 


5192 


6978 


784CIP2C_212 


7110 


1621 


3407 


5193 


6979 


784CIP2C 213 


7200 


1622 


3408 


5194 


6980 


784CIP2C_214


7212 


1623 


3409 


5195 


6981 


784CIP2C 215 


7218 


1624 


3410 


5196 


6982 


784CIP2C_216


7249 


1625 


3411 


5197 


6983 


784CIP2C 217 


7500 


1626 


3412 


5198 


6984 


784CIP2C_218 


7509 


1627 


3413 


5199 


6985 


784CIP2C_219 


7523 


1628 


3414 


5200 


6986 


784CIP2C 220 


7544 


1629 


3415 


5201 


6987 


784CIP2C 221 


7564 


1630 


3416 


5202 


6988 


784CIP2C_222 


7568 


1631 


3417 


5203 


6989 


784CIP2C 223 


7631 


1632 


3418 


5204 


6990 


784CIP2C_224


7813 


1633 


3419 


5205 


6991 


784CIP2C_225


7831 


1634 


3420 


5206 


6992 


784CIP2C 226 


7843 


1635 


3421 


5207 


6993 


784CIP2C_227 


7907 


1636 


3422 


5208 


6994 


784CIP2C_228 


7943 


1637 


3423 


5209 


6995 


784CIP2C 229 


8175 


1638


3424 


5210 


6996 


784CIP2C_230 


8216 


1639 


3425 


5211 


6997 


784CIP2C_231 


8225 


1640 


3426 


5212 


6998 


784CIP2C_232 


8271 


1641 


3427 


5213 


6999 


784CIP2C_233


8397 


1642 


3428


5214 


7000 


784CIP2C_234


8466


1643 


3429 


5215 


7001 


784CIP2C_235 


8503 


1644 


3430 


5216 


7002 


784CIP2C_236 


8953 


1645 


3431 


5217 


7003 


784CIP2C_237 


9106 


1646 


3432 


5218 


7004 


784CIP2C 238 


9139 


1647 


3433 


5219


7005 


784CIP2C_239 


9555 


1648 


3434 


5220 


7006 


784CIP2C_240


9650 


1649 


3435 


5221 


7007 


784CIP2C_241


9889 


1650 


3436 


5222 


7008 


784CIP2C_242


9933


1651 


3437 


5223 


7009 


784CIP2C_243 


9953 


1652 


3438 


5224 


7010 


784CIP2C_244 


9981 


1653 


3439 


5225 


7011 


784CIP2D_1


746 


1654 


3440 


5226 


7012 


784CIP2D_2


3558 


1655 


3441 


5227 


7013 


784CIP2D 3 


3558 


1656 


3442 


5228 


7014 


784CIP2D_4 


3633 


1657 


3443 


5229 


7015 


784CIP2D_5 


3658 


1658 


3444 


5230 


7016 


784CIP2D_6 


3732 


1659 


3445 


5231 


7017 


784CIP2D 7 


4004 


1660 


3446 


5232 


7018 


784CIP2D 8 


4700 


1661 


3447 


5233 


7019 


784CIP2D_9


4703 


1662 


3448 


5234 


7020 


784CIP2D 10 


4774 


1663 


3449 


5235 


7021 


784CIP2D 11 


4894 


1664 


3450 


5236


7022 


784CIP2D_12 


4918 


1665 


3451 


5237 


7023 


784CIP2D_13 


5159 


1666 


3452 


5238 


7024 


784CIP2D 14 


7443 


1667 


3453 


5239 


7025 


784CIP2D 15 


8673 


1668 


3454 


5240 


7026 


784CIP2D 16 


8679 


1669 


3455 


5241 


7027 


784CIP2D 17 


8727 


1670 


3456 


5242 


7028 


784CIP2D 18 


8734 


1671 


3457 


5243 


7029 


784CIP2D 19 


8756


1672 


3458 


5244 


7030 


784CIP2D 20 


8818 


1673 


3459 


5245 


7031 


784CIP2D_21 


8844 


1674 


3460 


5246 


7032 


784CIP2D_22 


8846 


1675 


3461 


5247 


7033 


784CIP2D_23 


8912 


1676


3462 


5248 


7034 


784CIP2D 24 


8918 


1677


3463 


5249 


7035 


784CIP2D_25


8918 


1678


3464 


5250 


7036 


784CIP2D_26 


8941 


1679


3465


5251


7037 


784CIP2D 27 


8941 


1680


3466


5252 


7038 


784CIP2D_28 


8951 


1681 


3467 


5253 


7039 


784CIP2D_29


8951 


1682 


3468 


5254 


7040 


784CIP2D_30 


9007 


1683 


3469 


5255 


7041 


784CIP2D_31


9012 


1684 


3470 


5256 


7042 


784CIP2D 32 


9013 


1685


3471 


5257 


7043 


784CIP2D_33 


9025 


1686 


3472 


5258 


7044 


784CIP2D_34 


9053 


1687 


3473 


5259 


7045 


784CIP2D_35


9054 


1688 


3474 


5260 


7046 


784CIP2D_36


9054 


1689 


3475 


5261 


7047 


784CIP2D 37 


9113 


1690 


3476 


5262 


7048 


784CIP2D_38


9134 


1691 


3477 


5263 


7049 


784CIP2D 39 


9152 


1692 


3478 


5264 


7050 


784CIP2D_40 


9152 


1693 


3479 


5265 


7051 


784CIP2D 41 


9211 


1694 


3480 


5266 


7052 


784CIP2D_42 


9223 


1695 


3481 


5267 


7053 


784CIP2D 43 


9223 


1696 


3482


5268 


7054 


784CIP2D_44


9231 


1697 


3483 


5269 


7055 


784CIP2D 45 


9236 


1698 


3484 


5270 


7056


784CIP2D_46 


9236 


1699 


3485 


5271 


7057 


784CIP2D_47


9303 


1700 


3486 


5272


7058 


784CIP2D 48 


9309 


1701 


3487


5273 


7059 


784CIP2D 49 


9314 


1702 


3488 


5274 


7060 


784CIP2D 50 


9326 


1703 


3489 


5275 


7061 


784CIP2D_51


9339 


1704 


3490


5276 


7062 


784CIP2D 52 


9348 


1705 


3491 


5277 


7063 


784CIP2D 53 


9376 


1706 


3492 


5278 


7064 


784CIP2D_54


9382 


1707 


3493 


5279 


7065 


784CIP2D 55 


9407 


1708 


3494 


5280 


7066 


784CIP2D_56 


9414 


1709 


3495


5281 


7067 


784CIP2D_57


9439 


1710 


3496 


5282 


7068 


784CIP2D 58 


9485 


1711 


3497


5283 


7069 


784CIP2D_59 


9493


1712 


3498


5284 


7070 


784CIP2D_60 


9501 


1713


3499


5285 


7071 


784CIP2D_61


9526 


1714 


3500


5286 


7072 


784CIP2D 62 


9526 


1715 


3501 


5287 


7073 


784CIP2D 63 


9551 


1716 


3502


5288 


7074 


784CIP2D_64


9557 


1717


3503


5289 


7075 


784CIP2D 65 


9568 


1718 


3504 


5290 


7076 


784CIP2D 66 


9588 


1719


3505


5291


7077 


784CIP2D_67


9597 


1720


3506 


5292 


7078 


784CIP2D_68


9615 


1721 


3507


5293 


7079 


784CIP2D_69 


9628 


1722 


3508 


5294 


7080 


784CIP2D 70 


9649 


1723 


3509 


5295 


7081 


784CIP2D_71 


9652 


1724 


3510


5296


7082


784CIP2D 72 


9660 


1725 


3511 


5297 


7083 


784CIP2D_73 


9662 


1726


3512 


5298 


7084 


784CIP2D_74


9725 


1727 


3513 


5299 


7085 


784CIP2D 75 


9746 


1728 


3514 


5300 


7086 


784CIP2D 76 


9777 


1729 


3515 


5301 


7087


784CIP2D 77 


9787 


1730 


3516 


5302 


7088 


784CIP2D 78 


9790 


1731 


3517 


5303


7089 


784CIP2D 79 


9842 


1732 


3518 


5304 


7090 


784CIP2D_80 


9842 


1733 


3519 


5305


7091 


784CIP2D 81 


9848 




WO 01/53312 



PCT/US00/34263 



1734 


3520 


5306 


7092 


784CIP2D 82 


9867 


1735 


3521 


5307 


7093 


784CIP2D 83 


10010


1736 


3522


5308 


7094 


784CIP2D 84 


10011 


1737 


3523 


5309 


7095 


784CIP2D_85 


10052 


1738 


3524 


5310 


7096


784CIP2D 86 


10057 


1739 


3525 


5311 


7097 


784CIP2D_87


10085 


1740 


3526 


5312 


7098 


784CIP2D 89 


10139 


1741 


3527 


5313 


7099 


784CIP2D_90 


10142 


1742 


3528 


5314 


7100 


784CIP2D 92 


10165 


1743 


3529 


5315 


7101 


784CIP2D 93 


10173 


1744 


3530 


5316 


7102 


784CIP2D 94 


10173 


1745 


3531 


5317 


7103 


784CIP2D_95 


10273 


1746 


3532 


5318 


7104 


784CIP2E 1 


3121 


1747 


3533 


5319 


7105 


784CIP2E 2 


3628 


1748 


3534 


5320 


7106 


784CIP2E 4 


3673 


1749 


3535 


5321 


7107 


784CIP2E_5 


4018 


1750 


3536 


5322 


7108 


784CIP2E 6 


4467 


1751 


3537 


5323 


7109 


784CIP2E 7 


4865 


1752 


3538 


5324 


7110 


784CIP2E_8 


4916 


1753 


3539 


5325 


7111 


784CIP2E_9


4923 


1754 


3540 


5326 


7112 


784CIP2E_10 


4926 


1755 


3541 


5327 


7113 


784CIP2E 11 


4962 


1756 


3542 


5328 


7114 


784CIP2E_12


4963 


1757 


3543 


5329 


7115 


784CIP2E_13 


4964 


1758 


3544 


5330 


7116 


784CIP2E_14 


4988 


1759 


3545 


5331 


7117 


784CIP2E 15 


5835 


1760 


3546 


5332 


7118 


784CIP2E 16 


7682 


1761 


3547 


5333 


7119 


784CIP2E_17 


7682 


1762 


3548 


5334 


7120 


784CIP2E_18


7699


1763 


3549 


5335 


7121 


784CIP2E_19


7707 


1764 


3550 


5336 


7122 


784CIP2E_20


7707 


1765 


3551 


5337 


7123 


784CIP2E 21 


7752 


1766 


3552 


5338 


7124 


784CIP2E_22


8357 


1767 


3553 


5339 


7125 


784CIP2E 23 


9065 


1768 


3554 


5340 


7126 


784CIP2E 24 


9324 


1769 


3555 


5341 


7127 


784CIP2F 1 


2976 


1770 


3556 


5342 


7128


784CIP2F_2 


3559


1771 


3557 


5343 


7129 


784CIP2F_3 


4021 


1772 


3558 


5344 


7130 


784CIP2F 4 


4474 


1773 


3559 


5345 


7131 


784CIP2F_5


4566 


1774 


3560 


5346 


7132 


784CIP2F_6 


4705 


1775 


3561 


5347 


7133 


784CIP2F_7 


4707 


1776 


3562 


5348 


7134 


784CIP2F 8 


4712 


1777 


3563


5349


7135


784CIP2F_9


5008 


1778 


3564 


5350 


7136 


784CIP2F_10 


5009 


1779 


3565 


5351 


7137 


784CIP2F_11


5015


1780 


3566 


5352 


7138 


784CIP2F_12


5015 


1781 


3567 


5353 


7139 


784CIP2F_13 


7724 


1782 


3568 


5354 


7140 


784CIP2F 14 


7725 


1783 


3569 


5355 


7141 


784CIP2F_15 


8828


1784 


3570 


5356 


7142 


784CIP2F 16 


8830 


1785 


3571 


5357 


7143 


784CIP2F_17 


9739 


1786 


3572 


5358 


7144 


784CIP2F_18 


9896 



TABLE 7



SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5359 


337 


1131 


AHLSAR^SALI LDEVAILPAPQNLS VLSTNMKHLLMWS P VIAPG 
ETVYYS VEYQGEYESLYTSHIW I PS SWCSLTEGP3CDVTDDITA 
TVPYNLRVRATLGSQTS /CLEHP /VS I PLIETQPSLPDL/RME I 
TKDGFHLVIELEDLGPQFEFLVAYWRREPGAEEHVKMVRSGGIP 
VHLE TME PGAAY C VKAQTFVKA I GRYS AFS QTECVEVQGEAIPL 
VLAL FAF VGFML I L WVPL FVWKMGRLLQ / YLLL PRGGS S QTPW 
KITQF 


5360


2 


1115 


PRVRS SGGQEDPASQQWARPRFTQPSKMRRRVI ARPVGS S VRLK 
CVAS GHPRPDI TWMKDDQAI/rRPEAAEPRKKKWTIiSIiKNIiRPED 
S GK Y T CR VSNRAGAI NAT YKVD V I QRTR S K P VL TGTH P VNT T VD 
FGGTTS FQCKVRSDVKPVIQWLKRVEYGAEGRHNSTI D VGGQKF 
WLPTGDVWSRPDGS Y3^NKLLITRARQDDAGMY I CLGANTMG YS 
FRSAFLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAGAVFIL 
GTLLLWLCQAQ KKP CTPAP AP PLPGHR? PGTARDR SGDKDLP SL 
AALSAGPGVGLCEEHGSPAAPQHLLGPGP VAG P KL YPKLYTGHS 
TPHTYTHPPFSCQLNSSHS 


5361


3 


925 


HEGSISSANIIiLDDQFQPKliTDFAMAHFRSHLEHQSCTIWMTSS 
SSKELWYMPEEYIRQGKLSIKTDVYSFGIVIMEVLTGCRWLDD 
PKHIQLRDLLRELMEKRGLDSCLSFLDKKVPPCPRMFSAKLFCL 
AGRCAATRAKLRPSMDEVLWTLESTQASLYFAEDPPTSIiKSFRC 
PSPLFLENVPS 1 PVEDDESQKNNLLPSDEGLRIDRMTQKTPFEC 
S QSE VM FLSLDKKPES KKNEEACNM PSSSCEESWF PKY I VPS QD 
LRP YKVNIDPSSEAPGHS CRSRPVES SCSSKFS WD5 YEQYKKE 


5362


2 


4879


SCQ VEGCTRTYNSSQS IG KHMKTAHPDQ YAAFKMQRKS KKGQXA 
NNLKTPNNGKFVYFljPSPVNSSNPFFTSQTKANGNPACSAQLQH 
VS P P I F P AULAS VSTPLL S SMES V I N PK I TSQD KNSQGGMI*CS Q 
MENLPSTALPAQMEDLTKTVLPLNIDRGSDPPLSLPAESSSIDL 
FPSPADSGTNSVFSQLEKNTNHYSSQIEGNTNSSFIjKGGNGENA 
VFPSQVNVA^FSSTNAQQSAPEBCVKKDRGRGQTGKERKPKHKTK 
RAKKPAI IRDGKF I CSRCYRAFTNPRS LGGHLS KRS YCKPLDGA 
EIAQELLQSNGQPSLLASMILSTNAVNLQQPQQSTFNPEACFKD 
P S FLQLliAENRS PAFLPNT F PRSGVTN FNTS VSQEGS E 1 1 IQAL 
ETAG IPS TFEGAEMIjSHVSTGCVSD AS QVNATVM PNPTVP P LLH 
TVCHPNTLLTNQNRTSNS KTSS I BECSS LPVFPTNDXiLXiKTVEN" 
GLCSSSFPNSGGPSQNFTSNSSRVSVISGPQNTRSSHLNKKGNS 
ASKRRKKVAPPI.IAPNASQNLVTSDLTTMGLIAKSVEIPTTNLH 
SNVI PTCEPQS LVENLTQKLNNVNNQLFMTD VKENF KTS LESHT 
VLAPLTL KTENGDS QMMALNS CTTSVNS DLQ 1 3 EDNV I QNFEKT 
LEIIKTAMMSQILEVKSGSQGAGETSQNAQINYNIQLPSVKTVQ 
NNKLPDSSP\FSSFISVMPTESNIPQSE\VSHKEDQIQEILEGI* 
QKLKLEKBLSTPASQCVLINTSVTLTPTPVKSTADITVIQPVSE 
MINIQFM3KA/NKPFVCQNQGCNYSAMTKDALFKHYGKIHQYTPE 
MILEIBCKNQLKFAPFKCWPTCTKTFTRWSNLRAHCQLVHHFTT 
EEMVKLKIKRPYGRKSQSENVPASRSTQVKKQLAMTEENKKESQ 
PALELRAETQNTHSNVAVIPEKQLIEKKSPDKTESSLQVITVTS 
EQCNTNALTNTQTKGRKIRRHKKEKEEKKRKKPVSQSLEFPTRY 
SPYRPYRCVHQGCFAAFTIQQNLILHYQAVHKSDLPAFSAEVEE 

HEMTPEE I E S MTAS VD VGK FP CDQLEC KS S FTTYLNYWHLEAD 
HGIGLRASKTEEDG VYKQ3CEGCDRI YATRS NIXRHI FKTKHNDK 
HKAHLIRPRRLTPGQENMSSKANQEKSKSKHRGTKHSRCGKEGI 
KMPKTKRKKKNNLENKNAK1 VQIEENKPYSLKRGKHVYS IKARN 
DALSECTSRFVTQYP CMIKGCTSWTSESNI IRH YKCHKLSKAF 
TSQHRNLLIVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 
NSRTTATVSQKE VEKNE* DEMDELTETj FITKLINEDSTS VETQA 
NTSSNVSNDFQEDNL CQSERQKASNIiKRVUKEKNVS QNXKRKVE 
KAEPASAAELSSVRKEEETAVAIQTIEEHPASFDWSSFKPMGFE 
VSFLKFLEESAVKQKKNTDKDHPNTGNKKGSHSNSRKNIDKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL 
Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








ViKQfcQEMKPTVSLKKLEVHSNDPDMSVMKDIS IGKATGRGQ Y 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRIiNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKROAQQMVQPQSPVAVS 
QS K PGCYDNGKHYQ INQQWERTYLGNALVC TCYGGS RGFNCE S K 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNIjIiQCICTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DS GWYS VGMQLA* KTQGNKQML \ CTCLGNGVS CQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYSFCTDHTVI*VQTRGGNSNGALCHFPFLYIJNHNYTDCTSEGRR 
DNMKWCGTTQN YDADQ KFGF C PMAAHEE I CTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCD PVDQCQDSETGT FYQ I 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPIiQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNSYTIKGLKPGWYEGQLISIQOYGHQEVTRFDFTTTSTST 
P VTSNT\VTGETTP FSPLVATSES VTEITAS S F WS WVSASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 

pqapitgyrivyspsvegsstelnlpetansvtiisdtiqpgvqyn 
itiyaveenqestpwiqqettgtprstjtvpsprdlqfvevtdv 
kvtikwtppesavtgyrvdvipvnlpgehgqrlplsrntf\aen 
tg1»s pgvt yy f kv favshgr e s kpltaqqttkl\daptnlqf vn 
etdstvlvrwtppraqitgyrltvgltrrgqprqynvgpsvsky 
plrnlqpas eytvslvai kgnqespkatgvfttlqpgs s i ppyn 
tevtetti v itwtpapri gfklgvrpsqggeaprevtsdsgs i v 
vsgltpgveyvytiqvlrdgqernap\ivnk\wtplspptnlh 

LEAWPDTGVIjTVSWERSTTPD ITGYRITTTPTNGQQGNSIiEEW 
HADQSSCTF\DNIiEVPGLSYNVSVYTVKDDKESVPI SDTI IPAV 
PP FTDLRFTN/ ILGPDTMRVTW\APPPS IDLTNFLVRYS PVKNE 
GRMLQSLS I FFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\ IAPRA/TP I 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNS3:TLTNLTPGTEYW 
S I VALNGREE S PLL IGQQSTVSDVPRDUE WAATPTSLLI \ SWD 
APAVTVRY YR I T YGETGGNSPVQE FTVPGS KSTATI SGLKPGVD 
YTITVYAVTGRGDS PAS S KP I S I NYRTE IDKPSQMQVTD VQDNS 
ISVKMLPSSSPVTGYRVTTT\PKWGPG\PTKTKTAGPDQTEMTI 
EGLQPrVEYWSVYAQWPSGESQPIiVQTAVTjPTIDRPKGrAFTDV 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSEYTVSWAIiHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SWVSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR 
ARVTDATETTrTXSWRTKTETITGFQVDAVPANGQTPIQRTIKP 
D VRS YTI TGLQ PGTD YKI YLYTLNDNARSS P WI DAS TAIDAPS 

NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPE I LDV PSTVQKTPFVTHPG YDTGNG I QL PGT 

ALSQTT I S WAP FQDTS E Y I ISCHPVGTDEEPLQ FRVPGTS TS AT 
LTGLTRGATYNIIVEALKDC3QRHKVREEVVTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLI.CQCLGFGSGHFRCD 
SSRWCHDNGVWYKIGEKWDRGGENGQMMSCTCLGZ«3KGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNVNCP IECFMPLDVQ 
ADREDSRE 


5364 


8066 


703 


RL CCTGGGEGTPGASG KRGPAATTS LVXjCI PS VP PP VPFPTTiWP 
PPSWRRQPPGGIRRDFSRRtiRREANLVATCLPVRASLPHRLNMIi 
RGPGPGLIiLIAVLCJjGTAVPSTGASKSKRQAQQMVQPOSPVAVS 
QSKPGCyDWGKHYQIWOQWERTyLGNALVCTCydGSRGFNCESK 
P EAE ETCFDK YTGNTYRVGDT YER PKDSM I WDCTC IGAGRGR I S 
CTIANRCHEGGQSYKIGDTMRRPHETXK3YMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
I TCTSRNECNDQDTRTSYRIGDTWSKKDNRGNLIiQC 1 CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 
GNSNGEP CVLP FrYNGRTF YS CTTEG RQDGHLWCSTTSNYEQDQ 
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 
DNMKWCGTTQN YDADQ KFG FCPMAAHEE I CTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCI AYSQLRDQCI VDD 1 TYNVN 
DTFHKRHEEGHMLiNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPWSHP IQWNAPQP3HI SKY I LRWRPKNSVGRWKEATIP 
GHLKS YTIKGLXPGWYEGQLI SIQQ YGHQEVTR FDFTTTSTST 
PVTSNT\VTGETTPFS PLVATSES VTEITASS FWSWVSASDTV 
SGFRVEYELS EEGDE PQ YLVLPSTATS V\NIP \DIiLPGRKYI VN 
VYQIS EDGEQSLI LSTSQTTAPDAPPDPTVDQVDDTS I WRWSR 
FQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDljQFVEVTDV 
KVTIMWTPPESAVlX^RVt)VIPVNriPGEHGQRLPLSRNTF\AEN 
TGLS PGVTY Y FKVFAVS HGR ESKPLTAQQTTKL \DAPTNLQFVN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYMVGPSVSKY 
PLRNLQPASE YTVSLVAI KGNQES PKATGVFTTLQPGS SIPPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGZiTPGVE Y VYT I QVLRDGQERDAP \ I VNK\ WTPIjS PP TNUH 
LEAJJP DTGVLTYS WERS TTP DI TGYR ITTTPTNGQQGNSL EE W 
HADQSSCTF\DNLEVPGLEYUVSVYTVKDDKESVPISDTIIPAV 
PPPTDLRFTN / 1 LGPDTMRVTW \AP PP S IDLTNFLVRYS P VKNE 
GRMLOSLSIFFLSD23\AWLTNIiLPGTSYVVSVSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\D1TA\WSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\2GRPREDR\VPHSRNSlTLTnLTPGTEYW 
SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD 
APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATXSGLKPGVD 
YTITVYAVTGRGDSPASSKPISIWYRTEIDKPSQMQVTDVQDNS 
I S VKWLPS SS P VTG YRVTTT\ P KNGPG\ PTKTKTAG PDQTEM TI 
EGLQPTVEYVVSVYAQNPSGESQPLVQTAVTNIDRPKGIjAFTDV 
DVDS I KIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLR PGSE YTVS WALHDDMESQPL I GTQSTAI PAPTDLKFT 
QVTPTS LSAQWTPPNVQLTGYRVRVTPKEKTGPMKE INLAPDSS 
SVWSGI>1VATKYEVSVYAIjKDTLTSRPAQGVVTTLENVSPPRR 
ARVTDATETT IT IS WRTKTETI TGFQVDAVPANGQTP I QRT I KP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 
NTLRFLATTPNSIiLVSWQPPRARITGYI I KYEKPGSPPREWPRP 
RPG VTEATITGLE PGTE YTI YVIALKNNQKSEP L I G R KKTDE LP 
QLVTLPH PNLHGPE I LDVPST VQKTP FVTHPGYDTGWGI QL PGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTT I S WAP FQDTSEY 1 1 S CHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYNI IVEALKDQQRHKVREEWTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKIiliCQCLGFGSGHFRCD 
S S RW CHDNGVNYKI GEKWDRQGENGQMMS CTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNVNCPI ECFMPLD VQ 
ADREDSRE 


5365




703 


RLCCTGGGEGTPGASGKRG PAATTSLVLCI PS VPP PVPFPTLWP 
PPSHRRQPKK3IRROFSRRI»RREANLVATCIiPVRASLPHRIiNMIi 
RGPGPGLLLtiAVLCLGTAVPSTGASKSKROAQQMVQPQSPVAVS 
QS KPGCYDNGKH YQ I NQQWERT YIjGNALVCTCYGGSRGFNCES K 
PEAE ETCFDKYTGMT YRVGDT YE RPKDSM I WD CTC IGAGRGR I S 

ctianrcheggqsykigdtwrrphetggymlecvclgwgkgewt 

CKPIAEKCFDHAAGTSYVVGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSfCKDWRGWLLQCiCTGNGR'G 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSG WYS VGMQLA* KTQGNKQML \ CTCLGNGVS CQE TAVTQT YG 
GNSNGEPCVLPPTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
K YS FCTDHT VL VQTRGGNS NGALCHF PFL YNNHNYTDCTS EGRR 

dnmkhcgttqnydadqkfgfcpmaahee icttnegvm yr igdqw 
dkqhdmghmmrctcvgngrgewtciaysqlrdqcivdditynvn 
dtfhkrheeghmlnctcfgqgrgrwiccdpvdqcqdsetgtfyqi 
guswekyvhgvryqcycygrgigewhoqplqtypsssgpvevfi 
tetpsqpnshpiqwnapqpshiskyilrwrpknsvgrwkeatip 
ghlnsyti kglxpgwyegqlis i qq yghqe vtr fdfttts ts t 
pvtsnt\ vtgettffs plvatses vteitas s f wsw vsasdtv 
sgfrveyelsebgdepq yhvlps tats v\ni p \ dllpgrkyi vn 
vyqisedgeqslilstsqttapdappdptvdqvddtsiwrwsr 
pqap i tgyr i vyspsvegsstelnlpetansvtlsdlqpgvqyn 
itiyaveenqesrpwiqqettgtprsdtvpsprdlqfvevtdv 
kvtimwtppesavtgyrvdvt:pvnlpgehgqrlplsrntf\aen 
tgls pgvt y yfkvfavs hgreskp ltaqqttkl \ daptnlqfvn 
etdstvlvrwtp praq i tg yrltvgltrrgqprqynvgps vs ky 
plrnlqpase ytvslvai kgmqes pkatgvfttxiqpgs sippyn 
tevtettxvitwtpaprigfkiigvrps qggbaprevtsds gs i v 
vsglt pgve yvytiqvlrdgqerdap\ i vnk\ wtpls pptnlh 
leanpdtgvltvswersttpditgyritttptngqqgnslesw 
hadqssctf\ dnlevpgleyws vytvkddkesvpisdti ipav 
ppptdlrftn/ ilgpdtmrvtm\appp sidltnfi*vrys p vk3te 
grmlqsls i fflsdn\awltniilpgt3 ywsvssvyeqhestp 
\lhgrqktglds p \tgidfs \dita\ns ft \vhw\ i apra/tpi 
tgyri r\hhpehf \ sgr predr\vphsrns itiitni/tpgte y w 

S1VALNGREESPIjLIGQQSTVSDVPRDLEWAATPT5IjIjI\SWD 
APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD 
YTITV YAVTGRGDS PAS S KP I S IN YRTE IDK PSQMQVTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 
EGLQPTVEYWS VYAQNP S GESQPLVQTAVTNIDRP KGLAFTDV 
DVDSIKIAWESPQGOVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGS E YT VS WALHDDMESQPLIGTQSTAX PAPTDLKFT 
OVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINIAPDSS 
S VWSGLMVATK YE VS VYALKDTLTSR PAQGWTTLENVS P PRR 
ARVTDATETT I TIS WRTKTETI TGFQVDAVP ANGQTP IQRT I KP 
DVRS YT I TGLQPGTDYKI YLYTLNDNARSSPWI DASTAIDAPS 
NLR FLATTPNS LLVS WQP PRAR I TG Y 1 1 KYEKPGS P PRE WPRP 
RPGVTEATITGLEPGTEYTIYVXALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPErLDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQD^IIFEEHGFRRTTPPTTATPrRHRPRPYPPNVGQE 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LT^TRGATYNIIVEALKDQQRHKVREEVVTVGNSVWEGriNQPT 
DDSCFDPYTVSKYAVGDEWERMSESGFKLLCGCLGFGSGHFRCD 
SSRWCHDWGVNYKIGEKWCRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPSPEGTTGQS YNQYSQR YHQRTNTNVNCP I ECFMPLDVQ 
ADREDSRE 


5366 


8066 


703 


RLCCTGGGEGTPGAS GKRG PAATTS LVLC I PS VPPPVP FPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANIiVATCIiPVRASLPHRtiNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYVVGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLIjQCICTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 
GKSNGEPCVL,PFTYWGRrFYSCTTEGRQIX3HLWCSTTSNYEQDQ 
KYSFCTDHTVLVQTRGGWSNGAIiCHFPFLYNNH^YTbCTSEGRR 
DNMKWCGTTQNYDADQKFG F CPMAAHE E I CTTNEGVM YR I GDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAySQLRDQCIVDDlTYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHIS ECYILRWRPKNSVGRWKEATI P 
GHLNSYTI KGLKPGWYEGQL IS I QQYGHQEVTRFDFTTT STST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 

vyqi s edgeqsli ls tsqttapdappdptvdqvddts x wrwsr 
pqapitgyr i vysps vegsstelnlp3tans vtlsdlqpg vqyn 
i ti yaveenqestpwiqqettgtprsdtvps prdlqf vevtdv 
kvtimwtppesavtgyrvdvipvnlpgehgqrlplsrwtf\aen 
tglspgvtyyfkvpavshgreskpltaqqttkl\daptnlqfvn 
etdstvlvrwtppraqitgyrltvgltrrgqprqynvgpsvsky 
plrnlqpaseytvslvaikgmqespkatgvfttlqpgssippyn 
tevtettivitl^tpaprigfklgvrpsqggeaprevtsdsgsiv 
vsghtpg ve yvytiqvlrdgqerdap v ivnk\ wtpls pptnlh 
leanpdtg vltvs wers tt pd itg yr i ttt ptngqqgnsiiee w 
iiadqs s ctf \ dnlevpgle ynvs vytvkddkes vplsdt 1 1 fav 
ppptdlrftn/ ilgpdtmrvtw\ appp sidltnflvr ysp vkne 
grmlq53js i fflsdn \ awltnll pgtey ws vs s v yeqhe stp 

\ LRGRQKTGLDSP \TG I DFS \ DI TA\NS FT \ VHW \ I A PRA/ TP I 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITt»TNLTPGTEYVV 
S IVALNGREES PLLIGQQST VSD VPRDLE VVAAT PTS LLI \S WD 
APAVTVRYYRI T YGETGGNS P VQJEFTVPGS KSTATISGLKPG VD 
YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDUS 
I SVKWLPSSSP VTGYRVTTT\ PKNGPG\ PTKTKTAGPDQTEMTI 
EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV 
DVDS I KI AWE S PQGQVSRYRVT YSS PEDG IHELF PAPDGE EDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
S WVSGLMVATKYE VS VYALKDTLTS RPAQG WTTLENVS PPRR 
ARVTDATETT I T I S WRTKTETITGFQVDAVPAKGQTPI QRTI KP 
DVRSYTITGLQPGTDYKIYIiYTIjNDNARSSPVVIDASTAIDAPS 
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNliHGPEILDVPSTVQPCTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTIS WAPFQDTSEYI IS CHPVGTOEEPLQFRVPGTSTSAT 
LTGLTRGATYWI IVEALKDQQRHKVREEWTVGNSVNEGLNQPT 
DDSCFDPYTVSKYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCXCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGE P S PEGTTGQS YNQ Y SQR YHQRTNTNVNCPIE CFMP LDVQ 
ADREDSRE 


5367 


235 


3591 


KKILNMLCKKNIVIEYIADILYEYLYGFCFSGIKKYLIIHVLRL 

ILELWMTRLIiLEKS VS LQTQ YLIiIjIVKILS WFPGKEMRHHLQ I M 

EVMMRKQDS/RIVGNGSEQQU}KELADVLMDPPMDDQPGEKELV 

KRSQLDGEGDGPLSNQLSASSTINPVPLVGLQKPEMSLPVKPGQ 

GDSEASSPFTPVADEDSWFSKLTYLGCASVNAPRSEVEALRMM 

SILRSQCQI S LDVTLSVPNVSEGI VRLL^DPOTNTE IANYPI YKI 

LFC VRGHDGTPES DCFAFTE S H YNAELFR I HVFRCE IQEAVS R I 

LYS FATAFRRS AKQTP LS ATAAPQTPDS D I FTFS VS LE I KEDDG 

KGY FS AVP KDKDRQCFKLRQG I DKKI VI YVQQTTNKELAIERCF 

GLLLSPGKDVRNSDMHLLDLESMGKSSDGKSYVITGSWWPKSPH 

FQVVNEETPKDKVLFMTTAVDLVITEVQEPVRFLLETKVRVCSP 

NERLFWPFSKRSTTEWFFLKLKQIKQRERKNNTDTLYEWCLES 

ESERERRKTTASPSVRLPQSGSQSSVIPSPPSDDEEEDNDEPLL 

SGSGDVSKECAEKILETWGELLSKWHLNLNVRPKQLSSLVRNGV 

PEALRGEVTJQLLAGCHKNDHLVEKYRILITKESPQDSAITRDIN 
RTF P AHD Y FKDTGGDGQD SLY KI CKAYS VYDEE IG YCQGQS FLA 
AVLLLHMPEEQAFSVLVKIMFDYGLRELFKQNFEDLHCKFYQLE 
RLMQE YIPDL YNHFLD IS LEAHM YASQWFLTLFTAKFPLYMVFH 
I IDLLLCBGIS VIFKVALGLLKTSKDDLLJ .TDFEGALKFPRVQL 
PKRYRSEENAKKLMELACNMKISQKKLKKYEKEYHTMREQQAQQ 
EDP I ERFERENRRLQEANMRLEQENDDIiAHELVTE KIALRKDLD 
NAEEKADALNK3LLMTKQKLIDAEEEKRRLEESSAHLKKMCRRE 
LDKAESEIKKNSSIIGDYKQICSQLSERLEKQQTANKVEIEKIR 
QKVDDCERCREFFNKEGRVKG 1 SSTKEVLDEDTDEEKETXK3SIQL 
REMELELAQTKL\QLVEA2CKIQD\LEHPF *GLPFNE\VQAA\ K 
KTWFNRTLS S I KTATG VQGKETC 


5368


573 


2014 


GAAAGAADPRRGSLGGRTMLDFAI FAVTFLLALVGAVLYLYPAS 
RQAAGI PGITPTEEKDGNLPD IVNSGSLHE FLVNLHERYG PWS 
FWFGRRLWSI*GTVDVLKQHINPWKTIjD/IiF*NHAEVI I KVS iw 
WWQCE * KP\ QRKKL YENGVTDSLKSNFALLLKLPEELLDKWIiS Y 

petqh\vplsqhmlgfamksvtqmvmgstfeddqevirfqknhg 
tvws e i gkgfldgsld knmtrkkq yedalmqle s vlrn 1 1 kerk 
grnfsqhifidslvqgni^ndqqiledsmifslasciitaklctw 
a i wfltts ee vqkkl yee i nqvfgwg p vtpek i eql r ycqh vlc 
etvrtakltpvsaqlqdiegkidrf i ipretlvlyalgwlqdp 
ntwp s phkfdpdrfddelvmktfs s lgfsgtqe cpelrfaymvt 

T VLLS VLVKRLHLLSVEGQVI ETKYELVTS S RES AW I TVS KRY 


5369 


1 


6622 


PRS LCFSLWAE AAVLiADGGLRRRRRLIiRGTMS AS FVPNGAS LED 
CHCNLFCI^LTGIKWKKYVWQGPTSAPlIiFPVTEEDPILSSFS 
RCLKADVIjG/VWRRDQRPERRENL* I FWGGEDP\ VLLTLFTMTY 
QKKKMECGRMDFPMNAVLCFSKAVHNLLERCLMNRNFVRIGKWF 
VKPYEKDEKP INKSEHLSCS FTFFLHGDSNVCTSVE INQHQPVY 
LLSEEHITLAQQSNS P FQVI LCP FGl»NGTLTGQAFKMS DS ATKK 
LIGEWKQFYPISCCLKEMSEEKQEDMDWEDDSLAAVEVLVAGVR 
MIYPACFVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPASTRDPA 
MSSVTIiTPPTSPEEVQTVDPQSVQKWVKFSSVSDGFNSPSTSHH 
GGKI PRKIoANHVVDRVWQECNMNRAQNKRKYSASSGGLCEEATA 
AKVASWDFVEATQRTNCSCIiRHKNLKSRNAGQQGQAPSLGQQQQ 
ILPKHKTWEKQEKSEKPQKRPLTPFHHRVSVSDDVGMD\ADS\A 
SQRLV\rSAP\DSQ\VRFSNIR\TNDVAK\TPQMHGTEMANSPQ 
PPPLSP\HPCDWDEGVTKTPSTPQSQHFYQMPTPDPLVPSKPM 
EDRIDSIiSQSPPPQYQEAVEPTVYVGTAVNLEEDEANIAWKyYK 
FPKKKPVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 
PLKVSDELVQQ YQ I KNQCLSAIASDAEQEPK IDP YAFVEGDEEF 
LFPDfCKDRQNS EREAGKKHKVEDGTS S VTVLSHEEDAMSLFSPS 
IKQDAPRPTSKARP PSTS LI YDSDLAVS YTDLDNLFNSDEDELT 
PGS KRS ANGSDDKAS CKE SKTGNLDPLSC I STADLHKMYP TP PS 
LEQKIMGFSPMNMl^KEYGSMDTTPGGTVLEGNSSSIGAQFKIE 
VDEG FCS PKPS E 3 KDFS YVYKPENCQ I L VGCSMFAPLKTLPSQ Y 
LPL I KLPEEC I YRQS WTVGKLELLS S G PSMP PI KEGDGSNMD QE 
YGTAYTPQTHTS CGMPPSSAP P3NSGAG ILPSPSTPRFPTPRTP 
RTPRTPRGAGGPASAQGS VKYENSDLYS P ASTPS TCRPLNS VE P 
ATVPS I PEAHSLYVNLILSES VMNLFKDCNSDSCCI CVCNMNI K 
GADVG V Y I P DPTQEAQ YR CTCG FSAVMNRKFGNNSGLFFEDELD 
IIGRNTDCGKEAEKRFEALRATSAEHVNGGLKESEKLSDDLILL 

LEHGRQFMDNMSGGKVDEALVXSSCLHPWSKRNDVSMQCSQDIL 
RMLLS LQPVLQDAIQ KKRT VRP WGVQG PLTWQQ FHKMAGRGS YG 
TDESPEPLPIPTFLLGYDYDYLVLSPFAIiPYWERliMLEPYGSQR 
D IAYWLCPENEALLNGAKS KFRDLTAI YESCRLGQHRPVS RLL 
TDGIMRVGS TASKKLSEKLVAE W FS QAAD GNNEAFSKLKLYAQV 
CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGKA 
WTPSATLASAASSTMTVTSGVAISTSVATANSTLTTASTSSSSS 
SNLNSGVSSNKLPSFPPFGSMWSNAAGSMSTQANTVQSGQLGGQ 
OTSALQTAGISGESSSLPTQPHPDVSESrMDRDKVGIPTDGDSH 
AVrYPPAI WYI IDPFT YEN TDESTNSSS VWTLGLLR CFLEMVQ 
TLP PHIKSTVS VQI I P CQYLLQP VKHBDRE I YPQHLKSLAFSAF 

FIIAPVKDKQrELGETFGEAGQKYr^LFVGYCLSHDQRWXLASC 
TDLYGBLLETCIINIDVPNRARRKKSSARKFGLQKLWEWCLGLV 
QM S S LP WRW I GRLGR I GHGELKD WS CLLS RRNLQS LS KRLKDM 
CRMCGI SAADS PS ILSACLVAME PQGSFVIMPD S VSTGS VFGRS 
TTLNMQTSQLNTPQDTSCTHILVFPTSASVQVASATYTTENLDL 
AFNPNNDGADGMGIFDLLDTGDDLDPDI IN X LPASPTGSPVHSP 
GSffi PHGtjJJACaXGQS rDKLiUSTEPHEEVPN I JjQQPLALGYFVST 
AKAGPLPDWFWSACPQAQYQCPLFLKASLHLHVPSVQSDELLHS 
KHSHPLDSNQTSDVLRFVLEQYNALSWLTCDPATQDRRSCLPIH 
FWLNQT..YNF r MNML 


5370 


1226 


716 


RWSRKI^LRRAAC^TBSRPPQSQEMHPPTGKJEVHALKRLRDSAN 
ANDVETVQQLLBIXSADPCAADDKGRTALHFASCNGNDQIVQLLL 
DHGADPNQRDGLGNTPLHIiAACTNHVPVITTLLRGGARVDALDR 
AGRTPLHLAKSKLN I LQEGHAQCLKAVR /HGGEADHP YAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
AVRP LS SAAQGS APS S SS CCTVSTS LALAES LS LFRACTS L P VG 
GCISWL 


5371 


1331 


167 


IAAMLWKLLLRSQSCRLCS FRKMRS P PKYRP FLACF tYTTDKQS 
S KENTRTVE KL YKCS VD I RKIRR \ * KDGYF* RMKPMLKKLRI / F 
LQELGADETAVASILERCPEAIVCSPT7WNTQRKLWQLVCKNEE 
ELIKLIEQFPESFFTIKDQSNQKLNVQFFQELGLKNVVISRLLT 
AAP^FHNPVEKNKQMVRILQESYLDVGGSEANMKVWLLKLLSQ 
WPFIIjJjNSPTAIKEU I»Er liQEQGFTSFE ILQLXjSKLKGFLFQLC 
PRSIQNSISFSKNAFKCTDHDLKQLVLKCPALLYYSVPVLEERM 
QGLLREGI S IAQI RETPMVLELTPQI VQYRI RKLNSSGYRI KDG 
HLANLNGS KKE FE ANFGKIQAK KVRPLFNP VAP LNVEE 


5372 


51 


857 


SPGAQFLWAAPDMPDPLFSAVTQGKDEILHKALCFCPWLGKGGME 
PLRLL I LLF VTELSGAHNTTVFQGVAGQSLQ VS CPYDSM KHWGR 
RKAWCRQLGEKGP CQRWS THNLWLLS FLRRWNGSTAI TDDTLG 
GTLTI TLRNLQPHDAGLYQCQSLHGS B ADTLRKVLVE VLAD PLD 
HRDAGDLWFPG\DLRASRM?MWSTAS?GASWKEK3PSHPLPSFS 
SWPASFSSRF* QPAPSGLQPGMDRS QGHIHPVNWTVAMTQG I SS 
KLCQG 


5373 


2814 


346 


VKKT KS I FNS AMQEMEVY VEN I RRKFGVFNYS P FRTP YTPNSQY 
QMLL0PTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRR I SLSDMPRSPMSTNS SVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTAS PASTKTGQAGSLSGS PKPFS PQLSAP I TTKTD 
KTSTTGS ILNLWLDRSKAEMDLKELSESVQQQSTPVPLIS PKRQ 
IRSRFQLNLDKT IESCKAQLGINE I SEDVYTAVEHSDSEDSEKS 
DS SDS E YI SDDEQKS * GTS QEDTED KE GCQMDKE PSAVKKKP KP 
TWPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 
PHP IKDKIiKGKuEXDS PTVHLGLDSDSE \NELVIDIX3EDHSGRE 
GRKNKKEPKE PS PKQDWGKTPPSTTVGSHSPPETPVIjTRSSAQ 
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\TAP 
AVQRS CGTS STVQQKE I TQS PSTST ITLVTSTQSSPLVTSSGSM 
STLVSSVNGDLPlGrASADVAADIAKYTSKL\MDAlKGTM\TEl 
YNDLS KPT\TTWKAQLAEDSQGLRIE IEKLQWLHQQEL\SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLE1EKQQAVDETKKKQWC 
ANFKKEAI F YCCWNTS YCD YPCQ \ QAHWPEH\MKS CTQSATAPQ 
\QEADAE\VNTETLNRSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DE KRGS \ TRSDHN/ T PS TQHGRS LL PG KES RAGTP FLGTS K 


5374 


2814 


346 


VKKTKS I FNSAMQEMEVYVENI RRKFGVFNYS PFRTP YTPNSQY 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRRISLSDMPRSPMSTNSSVHTGSDVEQtlAEKKATSSHFSASE 
ESMDFLDKSTASPAS TKTGQAGSLSGS PKPFS PQLSAP I TTKTD 
KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ 
IRSRFQriWLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVXDKASPEPEKDFSGKAKPS 
PHP I KDKLKGKDETDS PTVHLGLDSDSE\NELVIDLGEDHSGRB 
GRKNKKBPKEPSPKQDWGECTPPSTTVGSHSPPBTPVLTRSSAQ 
TSAAGATATTSTSSTVT VTAPAPAATGS P VKKQRPLLPKE \TAP 
AVQRSCGTSSTVQQKE ITQSPSTSTITLVTSTQSS PLVTS SGSM 
STt«VS S VNGDIjP igtasadvaadi AKYTSKL\MDAIKGTM\TE I 
YNDLS KN\TT WKAQLAEDSQGLR I E I EKLQWLHQQEL\SEMKHN 
LELrMAEMRQSWEQERDRLrAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAIPYCCWNTSyCDYPCQ\QAHMPEH\MKSCTQSATAPO 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5375 


2907 


1116 


HIFIiAEEEPMLERRCRGPLAMGPAQPRLLSGPSQESPQTLGKES 
RGI»RQQGTSVA\QSGAQAPGRAHRCAHCRRHFPGWVA\LWLHTR 
RCQA /RGLPL P CPECGRR FRHAP FLALHRQVHAAATPDWG FACH 
LCGQS FRGWVAI»VLHLRAHSAAKAGPFACPKMARDAFWRRKAAS 
SS I LRRCHPSRPRGPRPF I CGNCGRS I LPTWDQ / LKVAHKRVHV 
SRRP*ERGPPAKVFWGPRPRGPPTGDTPFGPGGDAVDRPF\QCA 

ccgkrfrhk\pwlrrshaactsgerphq/csrecg\krftnkpy 
lts\hrrithtarqpypckecgrrfrhkpnllshskihkrsegs 
aqaapgpx3spqlpagpqesaaeptpavplkpaqepppgappehp 
qdp ieappsi* ys cddcgrs frlerflrahqrqhtgerp ftcaec 
gknfgkkthlvahsrvhsgerpfrlarkcgrrflprasqsggrn 
sae pnaprfg p f vcpdcg kafrhkp ylaahrp i atpaekp yvcp 
dcrkafsqksnlWshrrihtgerpyacpdcdrsfsqksnlith 
rkshirdgafccaicgqtfddeerllahqkkhdv 


5376 


4504 


591 


VSTFSliCLWPAGGGGRGRVSNDOAQSKRHVYSRTPSGSRMSAEAS 
ARPLRVGSRVEV1GKGHRGTVAYVGATLFATGKWVGVILDEAKG 

KNDGTVQGRKYFTCDEGHGI fvrqsq iqvfedgadtts petpds 

SAS KVLKREGTDTTAKTS KLRGLKP KKAPTAR KTTTRRP KPTRP 
ASTGVAGASSSLGPSGSASAGELSSSEPSTPAQTPLAAPI I PTP 
VLTS PGAVPPLPS PSKEEEGLRAQ VRDLEEKLETLRLKRAEDKA 
KLKELEKHKIQLEQVQEWKSKMQEQQADIiQRRr,KEARKEAKEAL 
EAKERYMEEMADTADAIEMATLDKEMAEERAESLO^EVEAIiKER 
VDELTTDLEILKAEIEEKGSDGAASSYQLKQLEEQNARLKDALV 
RMRDLSSSEKQEHVK\LQKLMEKKNQELEWRQQRERLQEELSQ 
AEST3DELKEQVDAAI/5AEEMVEMLTDRNLNLEEKVRELRETVG 
DLEAMNEMNDELQENARBTEIjELREQLDMAGARVREAQKRVEAA 

qetvadyqqt i kkyrqltahlqdvnreiitnqqeas verqqqppp 
etfdfkikfaetkahakaiemelrqmevaqanrhmslltafmpd 

SFI>RPGGDHDCVLVI/IiLMPRIjICKAELIRKQAQEKFELSENCSE 

rpglrgaageqlsfaaigi»vy\slmpaaghryhry*chalsqcr 
ld\vykkvgsl y pems ah e rs ldfl i ellhkdqldetvnve p lt 
kaikyyqhlys ihlaeqpedcrmqiadhikftqsaldcmsvevg 
rlraflqggqeatdialllrdletscs \ dirqfckkirrrmpgt 
dapg1 paalapgpqvsdtlldcrkhltwwavlqe vaaaaaqli 
aplaenegllvaaleetafkaseqiygtpssspyeclrqscnil 
istmnk\l.vtamqege ydaerppskppp \velraaax.rae itda 
eglglki»e drbtv1 kelkks l ki kge els eanvrltllekklds 
aakdaderiekvqtriieeto^llrkkekefeetmdalqadidql 

EAEKAELKQRLNSQS KRTI EGIjRGPPPSGIATL VSGIAGEEQQR 
GAIPGQAP GS VPG PGLVKDS PLLLQQ I S AMRLH I SQ LQHENS I L 
KGAQMKASIASLPPLHVAKLSHEGPGSELPAGALYRKTSQLLET 
LNQLSTHTKWDITRTSPAAKSPSAQLMEQVAQLKSLSDTVEKL 
KDEVIiKETVSQRPGATVPTOFATFPSSAFLRAKBEQQDDTVYMG 
KVTFSCAAGFGQRHRLVLTQEQUiQLHSRLXS 


5377 


752 


1106 


DVPCICRVLPAEAQEKGQLTLSCGESGEEG \F* YHEVRQAEGES * 
/MFGPNVRLVHTQLKTKKPSGTLKAKFYLHTGSTKFAARISCTK 
SS*WPGYDGWWGGQYIFIFRGMRWEEQP 


5378


2009


664 


QASGTTLRPLPDLPQLKI^EATSRNRJUjKPRGRIiVLMTSCLPAIi 
RFIATPRLSAMPHlDMDVKLDFKDVIiliRPKRSTLKSRSEVDLTR 
S FS FRNSKQTYSG VP 1 I AANMDT VGTFEMAKVLCKS * V PGSFWJD 
VPQMGCVFLI YKLFTLKWKMLLLS VLLPAS ILVAEKFSL FTAVH 
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQI leai p 
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTrMAGNWrGEMV 
EELILSGADI I KVGIGPGSVCTTRKECTGVGYPQIiSAVMECADAA 
HGLKGH 1 I SDGGCSCPGDVAKAFGAGADF VMLGGMLAGHSESGG 
ELIERDGKICYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKIjKELSRHTTFIRVTQQ 
VNPIFSEAC 


5379 


2009 


664


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLKTSCLPAL 
RFIATPRLSAMPHIDNDVtaDFKDVLLRPKRSTLKSRSEVDLTR 
S FS FRNS KQT YSG VP 1 1 AANMDT VGTFEMAKVLCKS * VPGS FWD 
VPQMGCVFLIYKLFTLKWKMLLIiSVIiLPASILVAEKFSLFTAVH 
KHYSLVQWQEFAGQWPDCLEHLAASSGTGS SDFEQLEQ ILEAI P 
Q VK Y I CLD VANG YS EH F VEF VKDVR KRFPQHTI MAGNWTGEMV 
EELILSGADI I KVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA 
HGLKGHI I SDGGCS CPGDVAKAFGAGADFVMLGGMLAGHSESGG 
EI* IE RDG KKYKLF YGMS S * I \ AM\ KK YAGG VAE YRAS EGKTVE V 
PFKGDVEHT I RDILGG I RSTCT Y VGAAKLKELSRRTTF I RVTQQ 
VNPIFSEAC 


5380 


2 


2050 


P S RAGG AERGRAAAARS PGGS AAGWE CPS VLDEAGACTMSS C VS 
SQPS SNRAAPQDELGGRGSSSSESQKPCEAIiRGLSSLS IHLGME 
S F I WTECEPGCAVDLGLARDRP LE ADGQEVPLDTSGSQARPHL 
SGRKIiSLQERSQGGLAAGGSLDMNGRCI CPS LP YSP VS SPQSS P 
RLPRRPTVESHHVS ITGMQDCVQLNQYTLKDE IGKGS YGVVKLA 
YWENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGPI\EQVYQEIA\lLKKLDHPNW\KIiVEVL\DDPNEDHLYMV 
F\ELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKII 
H \RDI KPSNLLVGEDGHIKIADFGVSNEFKGSDAL&SNTVGTPA 
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM 
CLHSKIKSQALEFPDQPD IAEDLKDLITRMLDXNPESRI WPE I 
KLHPWVTRHGAEPLPS EDENCTLVEVTEE EVENSVKHI PSLATV 
ILVXTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL 
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGT?LPFPLSTSWIi 
PDL VGAPGSHFCFLNI ALLRYNS HTM 


5381 


2 


2050 


PSRAGGAERGRAAAARS PGGSAAGWE CPS VLDEAGACTMSS CVS 
SQPS SNRAAPQDELGGRGS S S S E S QKPCEALRGLS SLS I HLGME 
SPIWTECEPGCAWLGIJUIDRPLEADGQEVPLBT2GSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCI CPSLPYSPVS S PQSSF 
RLPRRPTVESHHVS I TGMQDCVQLNQYTLKDEIGKGSYGVVKXiA 
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGC I QP 
RGP I \ EQVYQE IA\ILKKLDHPNW\KLVEVL\DDPNEDHLYMV 
F\ELVNC3GPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKII 
H \RD I KPSNLLVGEDGH I KI ADFGVSNE FRGSDALLSNTVGT PA 
FMAPES LSETRK I FSGKALDVWAMGVTLYCFVFG* CPFMDERIM 
CLHSKIKSQALEFPDQPD I AEDLKDL I TRMLDKWPESRIWPEI 
KLHPWVTRHGAEPLPSEDENCTIiVEVTEEEVENSVKHIPSLATV 
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL 
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSHTM 


5382 


1536 


203 


GARGSQQDAP ALQE AE VRGPERAQ PARGRMTKARL FRLWLVLGS 
VFMILL I I VYWDS AGAAHF YLHT S FSR PHTGPP L PTPG PDRDRE 
LTADSDVDEFLDKFLSAGVKQSDLPRKETEQPPAPGSMBESVRG 
YDWSPRDARRSPDQGRQQAERRSVLRGFCANSSLAFPTKERPFD 
DIPNSELSHLIVDDRHGAIYCYVPKVACTNWKRVMIVLSGSLLH 
RGAPYRDPLRI PREHVHNAS AHLTFNK FWRR YG KLSRHLMKVKL 
KKYTKFLFVRDPFVRLIS AFRSKFELENEEF/ * PQVRRAHAAAV 
RQPHQ PARLGARGLPRWPQ\ VS FANFIQ YLLDPHT3KLAP FNEH 
WRQVYRLCHPCQIDYDFVGKLETLDEDAAQLLQLLQVDLAAPLP 
PELPGTGPPSSWEBDWFAKIPLAWRQQLVJCLYSADFVLPGypKP 
ENLLRD 


5383 


45 


5250 


VERLLGCRNS KRTWRMLIS KNMP WRRLQG rSFGMYSAEELECKLS 

VKSITNPRYLDSLGNPSANGLYDLALGPADSKBVCSTCVQDFSN 

CS GHLGHI ELPLTVYNPLL FDKL YLLLRGS CLNTCHMIiT CPRAVI 

HLLLCQLRVLEVGALQAVyELERILSRFLEENADPSASKIREEL 

EQYTTEI VQNNLLGS QGAHVKNVCES KS KL I AL FWKAHMNAKRC 

PHCKTGRSWRKEHNSKLTITFPAMVHRTAGQKDSEPLGIEEAQ 

IGKRGVLTPTSAREHLSAIJWKNEGFFljNYLFSGMDDDGMESRFN 

PS VFFLDFIj WPPSRS R P VSRLGDQMFTNGQTVNLQAVMKD WL 

IRKLLALMAQEQKLPEEVATPTTDEEKDSIiIAIDRSFLSTLPGQ 

SLIDKLYNIWIRLQSHVNrVFDSEMDKLMMDKYPGIRQILEKKE 

GLFRKHMMGKRVDYAARSV1CPDMY INTNE IGI PMVFATKLTYP 

QPVTPWITVQELRQAVINGPNVHPGASMVINEDGSRTALSAVDMT 

QREAVAKQLLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPTLH 

RPSIQAHRARILPEEKVLRLHYANCKAYNADFDGDEMNAHFPQS 

ELGRAEAYVIiACTDQQ YLVPKDGQPLAGL I QDHM VSG AS MTTRG 

CFFTREHYMELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQWS 

TLLINI I PEDHIPLNLSGKAKITGKAWVKETPRS VPGFNPDSMC 

ESQVI IREGEIiLCGVLDKAHYGS S AYGLVHCCYE I YGGETSGKV 

LTCIARLFTAYLQLYRG FTL3VED I LVKPKADVKRQR 1 1 E ESTH 

CGPQAVRAALNLPEAASYD3VRGKWQDAHLGKDQRDFNMI DLKF 

KEEVHHYSNEINKACMPFGLHRQFPENTLQLMVQSGAKGSTVNT 

MQISCTiIjGQIELEGRSTPLMASGKSLPCFEPYEFTPRAGGFVTG 

RFLTG I KPPEFFFHCMAGREGLVDTAVKTSRSGYLQRCI I KHLE 

GLWQYDLTVRDSDGSWQFLYGEDGLDIPKTQFLQPKQFPFLA 

SNYEVIMKSQHLHEVtiSRAJjPKKALHHFRAIKKHQSKHPNTLItR 

RGAFLS YSQKIQEAVKALKLESENRNGR/RPWDS /G/RMLRMWY 

ELDEES RRKYQKKAAACPDPSUSVWRPDIYFASVS ETFETKVDD 

YSQEWAAQTEKSYEK5ELSLDRLRTLLQL\KWQRSLCEPGEAVG 

IiIAAQS I GEP STQMTLNTFHFAGRGEMNVTLG I PRLRE I LMVAS 

ANIKTPMMSVPVLNTKKALKRVKSLKKQIiTRVCLGEVLQKIDVQ 

ESFCMEEKQNKFQVYQLRF0FLE»HAYYQQEKCLRPEDILRFMET 

RFFKLLMESIKKK^KASAFRNWTORATQRDLDNAGELGRSRG 

EQEGDEEEEGHIVDAEAEEGDADASDAKRKBKQEEEVDYESEEE 

EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGL/GH*GGPV 

PSRPPDAAPETHPQPGAPGA\EAMERRVQAVREIHPFIDDYQYD 

TEESLWCQVTVKLPLMKINFDMSSLVVSriAHGAVIYATKGITRC 

LLNETTIWKNEKELVLNTEGI1JLPELFKYAEVLDLRRLYSNDIH 

AI ANTYGI EAALRV IEKEI KDVFAVYGIAVDPRHLSLVADYMCF 

EGVYKPLNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSHDELR 

S PSACL WGKWRGGTGLFELKQPLR 


5384 


i?b 


886 


QSCGQRLPT VL* L*GPPGS CPCIIiSLF \PGRPHALPEIRPYINI 
TIXjKGDKGDPGPMGLPGYMGREGPQGBPGPQGSKGDKGBMGSPG 
AP CQKRFFAFSVGRKTALKSGEDFQTLLFERVFVNLDGC FDMAT 
GQFAAPLRG I YFFS LNVHS WIJYKETYVHIMHNQKEAVI LYAQPS 
ERSIMQSQSVMLDLAYGDRVWVRLFKRQRENAIYSNDFDTYITF 

sghlikaedd 


5385 


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVKSHKKNKIHM 
SPTFRRPKTL*LRRQPKYPWKSTPRRKKLDHHVIIKFPLTTE*A 
VKKIENNSljIiVFTVDVXANKHQXKOAVKK/LCDIDVAKVNTLJQ 
SDGERKAY VR LAPDYDALWATKI GIT 


5386


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM 
SPTFRRPKTL * LRRQ PKYP WKSTPRRNKLDHHVI I KFPLTTE* A 
VJCKIENNSLLVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLIQ 
SDGSRKAYVRLAPDYDALWATKIGIT 


5387 


2 


2117 


FWAASGGCWFVLGERRAGSLLSASYGTFAMPGMVLFGRRWA'IA 
SDDLVFPGFFELWRVLWWIGILTLYLMHRGKLDCAGGALLSSY 
L I VLMI L LAWICT VS AI MCVSMRGTI CWPGPRKSMSKLL YIRI* 
ALFF PEM VWASI^AAWVADG VQCDRTVVNG 1 I ATWVS W I 1 1 AA 
T WS III VFDPLGGKMAP YSSAG PSHLDSHDS SQLLNGIiKTAAT 
SVWETRIKLLCCCIGKDDHTRVAFSSTABIiFSTYPSDTDLVPSD 
IAAGLALLHQQQDNIRNNQ3PAQWCHAPGSSQEADLDAELKNC 
HH YMQ FAAAAYG WPL Y I YRNPLTGLCR I GGD C CR S KNPQTMT /M 
VGGDQLQL/ CTSAP ILHTHRAAVQGLHPRQLP WTRFTELPFLVA 
LDHRXBSVWAVRGTMSLQDVLTDLSAESEVLDVECEVQDRI*AH 
KG I SQ AARYVYQRL INDG I LSQAFS I APE YRL VI VGHS LGGG AA 
ALLATMVRAAYPQVRC YAPS PPRGLWSKALQE YSQSFI VS LVLG 
KDVI PRLSVTNLEDLKRR I LRWAHCNKPKYKI LLHGLW YEL FG 
GNPNNL PT ELDGGDQEVLTQ PLLGE QS LLTRWS PAYS FSS DS PL 
DSSPKYPPLYPPGRIIHLQEEGASGRFGCCSAAHYSAKWSHEAE 
FSKILI GPKMLTDHMPDI LMRALDS VVSDRAACVS CPAQGVS SV 
DVA 


5388 


1569 


753 


TADGGAGGGGRRQAGVRRHYLYPFTGGYRRRRAACQAERPAARS 
KDTDLAAYQKGNLGVQLRNMAQE TNHSQVPML CS TGCGFYGN PR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST 

DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF 
TWYTVTQMYTIAIjTITKQMLKNFVFQQEFKSFGSFHQQLLEYK 
ILEHLQTKN 


5389 


1569 


753 


TADGGAGGGGRROAGVRRHYLYPFTGGYRRRRAACQAERPAARS 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR 
TNGMCSVCYKEHLQRQNSSWGRISPPVOCTDGSVPEAQSAIjDST 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS 
DTAQQPSEEQSKSLE\NRNKKRIAVSCA£3RKWDLLGIjNAGVEMF 
tvvytvtq>1ytialtitkqmlkltffvfqqefksfgsfhqqlleyk 
ILEHLQTKN 


5390 


217 


1332 


EDPR KLME DKMWS ECEGP EMS LVCLTD FQAHAR EQLS KSTRDF I 
EGGADDSITRDDNIAAFKRIRLRPRYLRDVSEVDTRTTIQGEEI 
SAPICIAPTGFHCLVWPDGEMSTARAAQAA\GICYITSTFASCS 
LEDIVIAAPEGLRWTFQLYVHPDLQLKKQLXQRVESLGFKALVIT 
LDTP VCGNRRHD I RNQLRRNLTLTDLQS PKKGNAI PYPQMTP I S 
T SLCWNDLS W FQS ITRLPI I LKG I LTKEDAELAVKHNVQG I IVS 
NHGGRQLDEVLAS I DALTE WAAVKG KI EVYLDGGVRTGKDVLK 
ALALGAKCIFLGDAI LWALAS KGEHGVKE VLN I LTNE FHTS MA\ 
LTGCRS VAE INRNL VQFSRL 


5391 


1 


1292 


VKKAAGR SRGP PTAGGQR CEEAPGT VMERRLG VRAW VKENRGS ? 
Q PPVCNKLMtlQEQLKVMF VGGPNTRKD YHI EEGEEVF YQLEGDM 
VLRVLEQGKHRDWIRQGEIFLLPARVPHSPQRFANTVGLWER 
RRLETELDGLRYYVGDTMDVLFEKWFYCKDLGTQLAP IIQE FFS 
SEQYRTGKPXPDQLLKEPPFPLSTRSIMEPMSLDAWLDSHHREL 
QAGT PLS LFGDT YE TQ V IAYGQGS SEGLRQNVDVWLWQLEGS SV 
VTMGGRRLSLGPWMDSLLVLSWGPSY\AW\ERTQGSVALSVT\Q 
DPACKKS P WGEPS CHGLKAATGVPSTLEVPSLPNWS PS PHYLSV 
YCRCVPHRPAHCCHPPSCPSQPRGHAPGRAAAPHLLWQTQPTAL 
PVLPGGLPPAPLLP I PLSLQTQCSTS TPRR PS IKAS 


5392 


1 


1623 


I RGSNAQKWGAS GSGG AG PQP D PAG PGG VPALAAAVLGACE PR 
CAAPC PLPAIiSRCRGAGSRGSRGGRGAAGSGDAAAAAEW I RKGS 
F IHKPAHGWLHP DARVLGPGVS YVVR YMGCTEVLRS MRS LDFNT 
RTQVTREA INRLHEAVPGVRGS WKKKAPNKALAS VLG KSNLR FA 
GMS I £ IH I STDGLS LS VPATRQVI ANHHM PS I SFASGGDTDMTD 
YVAYVAKDPINQRACHI LECCEGL\AQSI ISXVGQAFELRFKQY 
LHSPPKVALPPERLAGPEESAWGDEEDSLEHKfYYNSIPGKEPPL 
GGLVDSRLALTQPCALTALDQGPSPSLRDACSLPWDVGSTGTAP 
PGDGYVQADARGPPDHEEHLYVNTQGLDAPEPEDSPKKDLFDMR 
P FEDALKLHECS VAAGVTAAPLPLEDQWPS P PTRRAPVAPTEEQ 
LRQEPWYHGRMSRRAAERMLRADGDFLVRDSVTNPGQYVLTGMH 
AGQPKHLLLVDPEGWRTKDVLFESISHLIDHHLQNGQPIVAAE 
S ELHLRG WSRE P 


5393 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSOAAAP 
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG 








\NL I PTHTQPS \ yRFKANNN\DSGEYTCQTGQTSD\SDPVHIjTV 
LSEW LVLQT PH LE FQEGETI MLR CHS \ WRDKP \LVKVTFFQNGK 
SQKFSHLDPTFS I PQANHSHSGD YHCTGN I G YTLFSS KP VT I TV 
OVPSMGSS S PMG 1 1 VAWIATA VAAT VMWALT yrp Tf TCP T «? AM 
STDPVKAAQFEPPGRQMIAIPJCRQLEETOWDYETADGGYMTLNP 
RAPTDDDKNIYLTLPPNDHVNSNW 


5394 


2 


982 


GGDSAGMTM ETQ MSQNVCPRNLWLLQPLTVLIiLLASADSQAAAP 
PKAVLKLEPPW INVLQ\EDSVTLTCQGAPQP / ERSDS IQWFHNG 
\NLI PTHTQPS \YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMLRCHS \ WRDKP \ LVKVT F FQNGK 
SQKFSHLDPTFS IPQANHSHSGDYHCTGNIGYTLFSSKPVTI TV 
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN 
STDPVKAAQFEP PGRQMI AI RKRQLEETNNDYETADGGYMTLNP 
RAPTDDDKNI YLTLPPNDHVNSNN 


5395 


3135 


531 


RASDAKNQEGLLNTRRKSTDS VP I SKSTLSRSLSLQASDFDGAS 
S SGNPEAVAIAPDAYSTGSS SASSTLKRTKKPRP PSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
SWDNQQENPP PTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPAS P 
PRSPAEPNDI PIAKGTYTFD I DKWDDPNFNPFSSTSKMQBSPKL 
PQQSYNFDPDTCDESVDPPXTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGUSJKPAKKKKTPLXTDT FRVKKSPKRS PLSDPPSQDP 
TPAATPETPP VI SAVVKATDE EKI^VTKQKWTCMTVDLEADKQD 
VPQPSDLiSTFVNE TKFSS PTEE IjDYRNS YE I E YME KIGS SLPQD 
DDAPKKQALYLMFDTSQES PVKSSPVRMSESPTPCSGS S FEETE 
ALVNTAAKNQH PVPRGLAPNQESHLQVPEKSSQKE LEAMGLGTP 
SEAI E I TAPEGSFASADALLSRLAHPVSLCGALD yi*epdlaekn 
PPLFAQ KLQREAAH PTDVS I SKTAL YSR I GTAEVE KPAGLLFQQ 
PDLDSALQIARAEI I TKEREVSEWKDKY EESRREVMEMRKI VAE 
YEKT1AQM I EDEQREKS VS \HQTVQQIiVI*E KEQa\ LADLNS VE K 

\ sriadiifrryekmkevlegfrkneevlkrcaqe ylsrvkkeeqr 
yqalkvha\eekldranae\iaqvrgkaqoeqaahqaslaerss 
crv\dalertleqknkeieeltkicdeliakmgks 


5396 


3135 


531 


RASDAKNQEGLLKTRRKS TDS VP TSK3 TLSRSLS LQAS DFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 

kptetppvtcetqqepdeeslvpsgeniasetktesaktegpspa 
lileetplepaagpkaacpldsesvegwppasgggrvqksppvg 

RKTIiPLTTAPEAGEVTPSJDSGGQBDSPAKGHSVRLEFDYSEDKS 

swdnqqenppptkkigkkpvakmplrrpkmkktpekldntpas? 

PRSPAEPWDIP IAKGTYTFDI DKWDDPNFKPFSSTSKMQESPKL 
PQQS YNFD PDTCDES VDPFKTSSKTPS SPSKSPASFE IPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQD? 
TP AATPETPPV I S AWHATDEE KLAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAFNQESHLQVPEKSSQKELEAMGLGTP 
S SAIE I TAPEG S FAS ADALL S RLAH P VSLCGALD YL E PDLAE KN 
PPLFAQKLQREAAHPTDVSI SXTALYSR I GTAEVEKPAGIXFQQ 
PDLDS ALQ XARAE I ITKEREVSEWKDKYEESRREVMEMRKIVAE 
YEKTI AQM I EDEQREKSVS\HQTVQQLVLEKEQA\LADLNS VEK 
\SLADLFPJR YEKMKE VLEGFRKNEE VLKRCAQE YIiSRVKKEEQR 
YQALKVHA\ESKLDRANAE\ IAQVRGKAQQEQAAHQASLAERSS 
CRV\nALERTLEQKNKE IEELTKI CDELIAKMGKS 


5397 


3135 


531 


RASDAKNQEGIjIiNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGeSSASSTLKRTKKPRPPSLJGCKQTTK 
BCPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEET PLEPAAGPKAACPLDS ESVEGWPPASGGGRVQNSPPVG 
RKTLPLTTAPEAGEVTPS DSGGQE DS PAKGHS VRLEFDYSEDKS 
SWDNQQENPP PTKK I GKKPVAKMPIiRR PKM KKTPEKLDNTFAS P 
PRSPAEPOTIPXAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL 








PQQSYNFDPDTCDES VDPFKTSS KTPS S PSKSPAS FEIPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTPRVKKSPKRSPLSDPPSQDP 
TPAATPETPPV I SAWHATDE E KLAVTNQKWTCMT VDIiEADKQD 
YPQPS DLS TFVNETKFSS PTEELD YRNS YEIE YMEK IGS SL PQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNUHPVPRGLAPNQESHLQVPEKSSQKELEAMGIiGTP 
SEAI E ITAPEGS FAS ADAI»LS RIAHP VS I*CGALDYLEPDIAEKN 
P PLFAQKtiQ REAAHPTDVS I S KTALYSR I GTAEVE KPAGLIj FQQ 
PDLDSALQXARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE 
YEKT IAQM I EDEQREKSVS \HQTVQQLVLEKEQA\LADLNSVEK 
XSLADLFRRYEKMKEVIiEGFRKNEEVLKRCAQEYLSRVKBCEEQR 
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS 
CRV\DALERTLEQKNKEIEELTKICDEliIAKMGKS 


5398 


56 


5426 


SGEVCRMESNFNQEGVPRPSYVFSADPIARPSEINFDGIKLDLS 
HEFSLVAPNTEANS FESKDYLQ VCLR IR PFTQS EKELESEGCVH 
ILDSQTWLKEPQCILGRXSEKSSG\QM\AQKFSFFPGFLGPAT 
TOKEFFQGCIMHP\VKDLLKGQSRLIFTYGLTNSGKTYTFQGTE 
EN I RILPRTLNVIi FDS LQERL YTKMNLKPHRSRE YLRLS S EQE K 
EE I ASKSALLRQI KEVTVHNDSDDTLYGS LTNSLNISEFEES I K 
DYEOANLNMANS IKFSVWVS FFEI YNEYIYDLFVPVSSKFQKRK 
MLRLSQDVKGYSFIKDLQWIQVSDSKEAYRLLKLGIKHQSVAFT 
KLNNASS RSHS I FTVKI LQ I EDSEMSR V IRVSELSLCDIAGSER 
TMKrQNEGERLRETGNINTSLLTLGKCINVLKNSEKSKFQQHVP 
FRESKLTHYF/QSFFNGKGKICMIVNISQCYLAYDETLNVLKFS 
AI AQKVC VPDTLNS S QEKLFG PVKSSQDVSLDSNS NSK ILNVKR 
ATISWENSLEDLMEDEDLVEELENAEETED/VGETKLLDEDLDK 
TLEEKKAF I SHEEKRKLLDL I EDLKKKli INEKKEKLTtiEFK I RE 
EVTQEFTQYWAQREADFKETLL0ERE1 LEENAERRLAI FKDfcVG 
KCDTRE EAAKDICATKVE TEEATACLELKFNQ I KAE LAKTKGEL 
IKTKEELKKRENESDSLIQELETSNKKIITQNQRIKELINIIDQ 
KEDTINEFQNLKSHMENTFKCNDKADTSSLIINNKLICNETVEV 
PKDSKSKICS3RKRVNENELQQDEPPAKKGSIHVSSAITEDQKK 
SEEVRPKIAE IED IRVLQENNEGLRAFLLTIEWELKKEKEEKAE 
LNKQIVHFQQ3LSI.SEKKNLTLSKEVQQIQSNYDIAIAEIjHVQK 
SKNQEQEEKIMKLSNE IETATRS ITNNVSQIKMHTKlDEIiRTL 
DS VS QI SNI DLLNLRDLS JJGS EEDNLPNTQ LDLLGiTO YL VSKQ V 
KEYRIQE PNRENS FHS S I EA I WEECKE I VKAS S KKS HQ IEELEQ 
Q I E KLQAE VKG YKD ENNRLKE KEHKNQDDLLKE KETL I QQLKEE 
LQEKNVTLDVQ IQHVVEGKRALS ELTQG VTC YKAKI KBZiETILE 
TQKVERSHSAKLEQDILEKESIILKLERNLKEFQEELQDSVKNT 
KDLNVKELKLKEBITQLTNNLQDMKHtiLQLKEEEEETNRQETEK 
LKEEIiS AS S ARTQN\LNADLQRKE ED YADL KEKLTDAKKQI KQ V 
QKEVSVMRDEDKLLRIKINELEKKKNQCSQELDMKQRXTIQQLK 
EQLINQKVEEAIQQYERACKDLNVKEKIIEDMRMTLEEQEQTQV 
EQDQVh \ EAKLS E VE RLATELDR WRVKCNDL E T KNNQR S WKEHE 
NNTDVLGKLTMLQDELQESEQKYNADRKKWLEE KMML ITQAKEA 
ENIRNKEMKKYABDRERFFKQQNEMEILTAQLTEKDSDLQKWRE 
ERDQLVAALEIQLKALISSNVQKDNErEQLKRIISETSKIETQI 
MDIKPKRI SSADPDKLQTE PLSTS FE I SRNKI EDGS WLDSCEV 
STENDQSTRFPKPELEIQFTPLQPMTKMAVKHPGCTTPVTVKIPK 
ARKRKSNEMEEDLVKCENKKNATPRTKrLKFPXSDDRNSSVJOGEQ 
KVAIRPSSKKYYSLRSQAS I IGVNLATKKKEGTLQKFGDFLQHS 
PS ILQSKAKKI IETMSS S KL SNVEAS KENVS QP KRAKRKL Y TS E 
ISSPIDISGQVILMDQKMKESDHQIIKRRLRTKTAK 


5399 


70S 


230 


GPRMAKFLSQDQINEYKECFSLYDKQQRGKIKATDLMVAMRCLG 
AS P TPGEVQRHLQTHGI DGNGELDFSTFLT I MHMQI KQEDPKKE 
ILLAMLMVDKEKKG YVMASDLRS KLTS LGEKI/THICEV \ DDL FRE 
\ AD I E PNfG KVK YDEF IHKI TS YLDGTY 


5400 


931 


248 


SHCSSGME I PPTNYPASRAALVAQNYINYQQGTPHRVFEVQKVK 
QASMEDIPGRGHKYRI/KFAVEEIIOKQVKVNCTASVLYPSTGQE 
TAPEVKFTFEGETGKNPDEEDNT F YQRLKSMKEPIiEAQN I \PDN 








FGKTVS PEMTLVLHLAWVACG Y I 1 WQNSTEDTWYKMVKIQTVKQV 
QRNDD F I ELDY T I L LHNI AS QE 1 1 PWQMQ VLWHPQ YGTKVKHNS 
RLPKEVQLE 


5401 


3 


1360 


TGWSYGPTTSIAFIAPRDFPFPPKLLIHPQAVVRLSCGAGSMGS 
QAAAEWRNWASWEGSSSLSGCSMGCFKDDRIVFWTWMFSTYFME 
KWAPRQDDM LFYVRRKLAYS G SESGADGRKAAEPE VEVE VYRRD 
SKKLPGLGDPDIDWEESVCIiNLILQKLDYMVTCAVCTRADGGDI 
HIHECKKSQQVFASPSKHPMDSKGEESKISYPNIFFMIDSF\BE\ 
VFSDMTVGKGEMVCVELVASDKTNTFQGVrFQGSIRYEALKKVY 
DNRVSVAARMAQK\MSFGFSKYSNMEF\VR\MKGPQGKGHAEMA 
VSRVSTGDTSPCGTEEDSSPASPMHERVTSFSTPPTPERNNRPA 
FFSPSLKRKVPRNRIAEMKKSHSANDSEEFFREDDGGADLHKAT 
NLRSRSLSGTGRSLVGSWLKLNRADGNFLLYAHt.TYVTLPIiHRI 
LTDILEVRQKPILMT 


5402 


3445 


1563 


GE C F1MAAVVQQNDLVFE FASWVME DERQLGDPAI FPAV I VEH V 
PGADILNS YAGLAC VEEP2TOMITESSLDVAEEE I IBDDDDDI TL 
TVEASCHDGDET1 ET I EAABALLNMDSPGPMLDEKR1NMNIFS S 
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPBQ 

ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRI>WRKHKNKP\D 
MNYE PMGRALRYYYQRG I LAKVEGQRLVYQFKEMPKDL IYINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ 
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPVWSP 
RNQQ\ LHTVTLQrVPLTTVI AS TDPSAGTGSQKFILOAI PS SQP 
MTVLKENVMLQSQKAGS PPS I VLGPAR V\QQVX.TSNVQTICNGT 
VSV\ASSPSFS\ATAPWT3jFLLGSSQLVAHPPGTVITSVIKTQ 
ETKTLTQE VEKKESEDHLKENT E KTSQQPQ P YVMWS S SNG FTS 
QVAMKQNELLE PNS F 


5403 


3445 


1563 


GECF I MAAWQQNDLVFEFASNVME DERQLGDPAI FPAVI VEHV 
PGAD I LNS YAGLACVEE PNDMI TES SLDVAEEE 1 1 DDDDDD I TL 
TVEASCHDGDETIETIEAAEAliLNMDSPGPMLDEKRIiJi?NI FSS 
PEDDM WAPVTHVSVTLDG I PE VWETQQ VQEKYAD S PGAS S P EQ 

PTfRK'VrtPTfTTfPPT?P'n^ , Da , T w PDKrTC! , \nf'K' MKTWUfrtWTT vt.wtt wr T 
e (u\uI\UIu\, J, XvJET * *\trU& JTX\A i l iilu V A-Tv f\_L^J IVLjVj Ivv?a\ 1X1 JJrY Est? 1 '» ' 

ALLQDKATCPKY I KWTQREKG r FKLVDS KPVSRLWRKHKNKP\D 
MfcWEPMGRALRYYYQRGIIiAKVEGQRLVYQFKEMPXDLIYINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ 
PVQAVPEGEAARTSTMQDETLNSSVQS IR\TIOAPTQVPWVS P 
RKQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP 
MTVLKEMVMLQSQKAGSPPSIVLGPARVXQQVXiTSNVQTICWGT 
VSV\ASSPSFS \ATAPVVTLFLLGSSQLVAHPPGTVITSVI KTQ 
ETKTLTQ EVEKKES EDHLKENTEKTEQQPQPYVMWSSSNGFTS 
QVAMKQNELLE PN S F 


5404 


187 


1111 


LPVTLI FAKMKTLQSTLLLLLLVPLIKPAPPTQQDSRI IYDYGT 
DNFEESI FSQDYEDKYLDGKNI KEKETVI IPNEKSLQLQKDEAI 
TPLP P KKENDEM PT CLLCVCLSGSVYC EE VDI DAVPPL PKE S AY 
I*YARFNKIKKLT\AKDFADIPNLRRLDFTGNIiIEDIEDGTFSKL 
SLVEELSLAENQLLKLP VLPPKLTLFNAKYNKIKSRGI KAWAFK 
KLNNLTFLYLDHNALES VPLNLPESLRVIHLQFNNI AS I TDDTF 
CKANDTSYIRDR1EEIRLEGNPIVLGKHPNSFICLKRLPIGSYF 


5405 


2199 


1220 


QNSRSLHMDPQNQHGSGSSLWIQQPSLDSRPRLDYEREIQPTA 
ILSLDQIKAI RGSNE YTEGPS WKRPAPRTAPRQEKHERTHE 1 1 
PINVNNNYEHRHTSHLGHAVLPSKARGPILSRSTSTGSAASSGS 
WSSASSEQGLLGRSPPTRPVPGHRSERAIRTQPKQLIVDDLKGS 
L KEDLTQHK F I CE QCGKCKCGE CTAPRTL PS CLACN RQCLCSAE 
SMVEYGTCMCL\VKGIFYHCSNDDEGDSYSDMPCSCSQSHCCSR 
Y LCMGAMS LFLPC LLC YP PAXGCLKLCRRC YDW IHR PGCRCKNS 
NTVYCKLESCPSRGQGKPS 


5406 


279 


2732 


EWRTYIWEGPIjTFMDVAIEFCLEEWQCLDTAQQNLYRIWHLBNY 








RNX»VFLG/ 1 IAVSKPDLITCXEQEKEPWEPMRRHEMVAKPPVMC 
SHFTQDFWPBQHIKDPFQKATLRRyKNCEHKNVHUCKDHKSVDE 
CKVHRGGYNGFNQCLPATQSKIFLFD.KCVKAFHKFSNSNRHKIS 
HTEKKLFKCKECGKSPCMLSHLAQHKriHTRVWFCKCEKCGKAF 
NCPS I ITKHKRINTGEKPYTCEECGKVFNWSSRLTTHKKNYTRY 
KLY KCEECGKAFNKSS I LTTHKI I RTGEKFYKCKECAKAFNQSS 
Nr.TEHKKIHPGEKPYKCEECGKAFNlNTPSTLTKHIGlIHTGEKPYT 

TKHKE IHTE KKP Y KCEE CG KAF KWS S KLTEH KLTHTGEKP Y KCE 
KCGKAFNCPSIITKHNRINTGEKPYTCEECGKVFNWSSRLTTHK 
KNYTRYKLYKCEECGKAFNKSSILTTHKKIH I EKKFYKCEECGK 
AFKWSSKLTEHKITHTGEKPYKCEECGKAFNHFSILTKHKRIHT 

SSNLTTHKKIHTGGKPYKCEECGKAFWQFSTLTJCHKIIHTEEKP 
YKCEE CGKAFKWS STLT KHKI I HTGEKP YKCEECG\ KAFKLS ST 
LSTHKIXHTGEKPYKCEKCGKAFNRPSNLIEHKKIHTGEQPYKC 
EECGKAFNYS SHLNTHKR IHTKEQP YKCKECGKAFNQYSNLTTH 
NKIHTGEKLYKPEDVTVILTTPQTFSNIK 


5407 


3 


659 


RPRRRQS SCCTG WLAG WIiIiRAAPR FCR RTETDMEQGKGLAVL I L 
All LLQGTLAQS I KGNHLVKVYD YQE DGS VLI/TCDAEAKN I TWF 
KDGKMIGFLTEDKKKWNLGSNAKDPRGMYQCKGSQNKSKPLQVY 
YRM CQN CIE LNAATI SGFL FAE I VS I FDLAVGVYFIAGTGMEFR 
QS \RASDKQTLLP \NDPAPTQPLKDPRKMTQYSHLQGN\QLRRN 


5408 


2745 


6128 


QGS KGTCHPQAQQ P WDEGVW QEAPSQ S EPWGQS QE P PTMPQRLiP 
HARQHTPJjPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSL 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF 
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFIiEWKSKP 
RLPTDLDIGG P WFPH YD FERSCWVRAISQEDQLATCWQAEHCGB 
VRNKDMSWPEEMS FI ANSS KIDRHKVPTEKGATGLSNLGNTCFM 
NSS I QCVSNTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLR WT I AKYAPRFNGPQQQDSQELLAF L 
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAVJDNHIjRRNRS 
XWDLFHGQliRSQVKCKTCGHISVRFDPFNFIiSLPLPMDSYMHL 
EITVIKLDGTTPVRYGLRMJMDEKYTGLKKQIiSDLCGLNSEQIL 
IiAEVHGSNlKNFPQDNQKVRLSVSGFLCAFEI PVPVSPISASS P 
TQTDFS S S PSTNEMF TLTTNGDLPR P IF I PNGMPN? WPCGTEX 
NFTNGMVNGHMP SLPDSPFTGYI 1AVHRKMMRTELYFLSSQKNR 
PSLFGMPIilVPCTVHTRKKDLYDAVWIQVSRLASPItPPQEASNH 

DRAFIGNAYIAVDWHPTALHIiRYQXSQERWDEHESVEQSRRAQ 
VEPINLDS CLRAFTS EEELGENEMYYCSKCKTHCLATKK£jDI*WR 
LPPILIIHLKRFQFVNGRMIK^QKIVKFPRESFDPSAFLVPRDP 
ALCGHKPLTPQGDEI>SEPRILAREVKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKMSSPKSSPRTLGRS 
KGRLRLPQ X GS KNKLS S S KENLDAS KEKG AGQ ICEIADALSRGH 
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQIiG 
NHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKN 
PKCKWYCYNDS SCKELHPDEIDTDSAYI LFYEQQGI DYAQFLPK 
TDGKKMADTSSMDEDFESDY\EKYCVLQ 


5409 


2745 


6128 


qgskgtchpqaqqpwdegVkqeApsqsepwgqsqepptmpqrlp " 
harqhtplplgsadyrrwsvrpqgphrdpkdsrdaakreqgsl 
aprpvpasrggktlckgyrqappgppaqfqrpi csas ppwasrf 
stpcpggavredtypvgtqgvpsiialaqggpqgswrfliewksmp 

RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE 
VRNKDMS WPEEMSF IANS SKI DRHKVPTEKGATGLSNLGNTCFM 
NSS IQCVSNTQPLTQYFI SGRHLYELNRTNP IGMKGHMAKCYGr> 
liVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELIAFI, 

ldgl:^dlnrvhekpyvelkdsdgrpdwevaaeawdnhlrrnrs 
IWDIjFHGQLRSQVKCKTCGHI svrfdp fnflslplpmds ymhi* 
eitvikldgtrpvrygiirixnmpekytglkkqlsdlcglnseqil 
iaevhgsniknfpqdkqkvrlsvsgflcafeipvpvspisassp 








TQTDFSSSPSTNEMFTLTTNGDLPRP I FIPNGMPNTWPCGTEK 
NFTNGMVNGHM PS LPDS P FTG YI I AVHRKMMRTELYFLS 3QKNR 
PSLFGMPIiIVPCTVHTRKKDIiYDAVWIQVSRIiASPLPPQEASNH 
flOTinnTi QMnvc^VDTTTT J^\^\7nI^v^wcr^aOT^*DtJV'Dl^/" , D^'' , ^•v■Tr^^•^**'e , 

tvJLi^ r lJiJsi y l\sX\it.tre X i-iK V V^JUJuJMO WilNLrn XltT ^Kul*n.J.lJl.ljx£< 

DRAF I GNAY I AVDWHPTALHbR YQTSQ ERVV&E HE S VE QS RRAQ 
VE P I KLDS CLRAFTS EEELGENEM YYC S KCKTHCIiATKKLDLWR 
LPPILI IHLKRFQFVNGRW I KSQKI VKFPRES FDP SAFLVPRDP 
ALCQH KPLTPQGDELSE PR I LAREVKKVDAQS S AGEEDVLL S KS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 

VLGGSQ PELVTPQDHE VALANG FL YEHEACGNGCGNG YSNGQLG 
NHSB EDSTDDQRE DTK I KP I YNL YA I S CHSG ILGGGH Y VT Y A KN 
PNCKWYCYNDS SCKELHPDE I DTDSAYILFYEQQG IDYAQFLPK 
TDGKKMADTS S MDEDFE S DY \ EK YCVLQ 


5410 


2 


710 


IJ^PGQARHWLAARMQAPHKEHLYKLI*VIGDI*GVGKTS I IKRY 
VHQNFSSH YRATI G VDFAJjKVLHWDP ET WRLQIiWDIAGQERFG 
NMTRVYYREAMGAF I VFDVTRP ATFE AVAKWKNDLDSKLS L PNG 
J\fc , vb V VJjJUiWKCUyijiUTvjjMNNoi CKESHGr VCaWrETSAK 
ENIK I DEASRCLVKHILANE CDLMES I E PDWKPHLTSTKVAS C 
SG\ CAKI LVGTFAGVW 


5411 


1302 


289 


TGPAAAGRRKALGS FGKPS PVTGLRAARRRRTRPSAPAAPS VGC 
GKRRESDAGAGGERASVRTGSGRRGGRTMAGDSEQTLQNHQQPN 
GGEFFLIGVSGGTASGKSSVCAKIVQLLGQNEVDYRQKQWILS 
QDSFYRVLTSEQICAKALKGQFNFDEPDAFDNELILKrLKEITEG 
KT VQI PVYDFVSHSRKEET VTVYPAD VVIiFEGI IiAF YSQER / 1 R 
DLFQMKLFVDTDADTRLSRRVIiKDISERGRDLEQILSSSTLRFV 
KPA\ FEE FCLPPK\ KYADVI I PR\GADN\RVP INL I VQHIQ\D I 
LNGGPS\NRQTKGCLNGYTPSRKRQASESSSRPH 


5412 


3180 


313 


QGISNFFHKEANFWFEVSGYIiISPLRSPFVDPAr*EWSLMASPW^ 
KMEGESSRFEIHTPVSDKKKKKCSIHKERPQKHSHEIFRDSSliV 
MEQSQ ITRRKKRKKDFQHL ISSPXiKKSR I CDETAWATSTLKKRK 
KRRYS ALE VDE EJAGVT WIiVDKENIKNT P KHFRKDVDWCVDMS 
IBQKLPRK\PKTDKFQVIAKSH\AHKSEALHSKVREKKNKKHQR 
KAASWESQRA\RDTLPOSEFPTQEESWI^SVGPGGBITEI*P\ASA 
HKNKSKKKKKKSSNREYETXliAMPEGSQAGREAGTDMQESQPTV 
GLDDETPQLLGPTHKKKSKKKKKKKSNHQEFESLAMPEGSQVGS 
EVGADMQES \R PAVGLHGE TAG t PAPAYKNKS KKKKKKSNHQEF 
EAVAMPES LESAYPEGSQVGS E VGTVEGS TAIiKGFKESNSTKKK 
SKKRKLTSVKRARVSGDDFSVPSKNSESTLFDSVEGDGAMMEEG 

LSADSGDADD5PADLGS AVKQLQEF I PN I KDRATS T I KRM YRDD 
LERFKEF KAQGVAI KFGKFS VKENKQLE KNVEDFLALTG IESAD 
KLL YTDR YPEEKS VI TWLKRRYS FRLHIG \RNIARPWJKLI YYRA 
KKMFDVNWYKGRYSEGDTEKLKMYHSLIiGNDWKTIGEMVARRSL 
S VALKFSQ I S S QRNRGAWS KS E TRKL I KAVEE VI LKKMS PQELK 
EVDSKLQENPESCIiSrVREKLYKGISWVEVEAKVQTRNWMQCKS 
KWTEILTKRMTNGRR I Y YGMNAL RAKVSL I ERI* YE INVEDTNE I 
DWEDLASAIGDVPPSYVQTKFSRLKAVYVPFWQKKTFPEIIDYL 
YETTLPLLKE KLEKMMEKKGTKIQTPAAPKQVFPFRD I FYYEDD 
SEGGGHRKRKRRPRRHAWFTP VI PVLWEAKAG WI I 


5413 


3753 


1304 


RFPAGVAPRRAMANVSKKVSWSGRDRDDEEAAPIjLRRTARPGGG 
TPrjLNGAGPGAARQSPRSALFRVGHMSSVKLDDELLEPXDMDPP 
HPFPKEIPHNEKLIjSIjKYESLDYDNSENQLFLiEEERRINHTAFR 
TVEIKRWVICALIGILTGLVACFIDIWENLAGLKYRVIKGNID 
KFTEKGGLS FSLLLMATLNAAF VLVGSVIVAF I EPVAAGSGIPQ 
IKCFLNGVKI PHWRLKTLVI KVSGVI LS WGGLAVGKEGPMIH 
SGSVIAAGISQGRSTSLBCRDFKIFEYLRRDTEKRDFVSAGAAAG 
VSAAFGAPVGGVLFSLEEGASFWNQFLTWRIFFASMISTFrLNF 
VLS X YHGNMWDLSS PGLINFGRFDSEKMAYTIHEIPVFIAMGW 
GGVLGAVFNALNYWLTMFRIR YIHRPCLQVI EAVLVAAVTATVA 
FVLIYSSRDCQPIjQGGSMSYPLQLFCADGEYNSMAAAFFNTPEK 








S WSLFHDPPGS YNPLTLGLFTLWFFLACWTYGLTVSAGVP I P 
SLLIGAAWGRLFGISLSYLTGAAIWADPGKYALMGAAAQLGGIV 
RKTLSLTVIMMEATSNVTYGFPIMLVLMTAKIVGDVFIEGLYDM 
HIQLQSVPFLHWEAPVTSHSLTAREVMSTPVTCLRRREKVGVIV 
DVLS DTASNHNG FP WEKADDTQPARLQGL I LRSQL I VLLKHKV 
FVERSNLGLVQRRLRLKDFRDAYPR FPP IQS IHVSQDERECTKD 
LS E FMKTPS P YT VPQE AS L PR VFKLFRALGLRHLVWDNRNQ WG 
LVTRKDLARYRLGKRGLEELSLAQT 


5414 


2130 


390 


GVASAWDRALFS PLLSPTSRVFRTS P PR C VSTETGRRDRARVP S 

QWCS VLQGKLPVSGRTSLACVRS I LLSPASS PRKVG I VGGTGAR 

AGAAPRDHGRVRHRRPS SARRMTRTTGQCIiAPRGCQGPRGTRS P 
RSPB S£TRPjfif*S IX S p& OT . to / f*P Q & T .T \na\7T .C v tm r .t .m vm no omr 

«vo JTXVtJA. i*vt\VJ\-Ort.i5Jrra\— Jj±'/ r LAOAUX VnVJjLI J.WI iJ_iLN iMUltlr i. V 

AGVLPD I EQFFNIGDS S SGLIQTVFI SSYMVLAPVFGYLGDRYN 
RKYLMCGGIAFWSLVTLGSSFIPGEH?WLl*IiLTRGLVGVGEASY 
STIAPTLIAXHiFVADQRSRMLSIFYFAIPVGSGLGYIAGSKVKD 
MAGDWHWALRVTPGLGWAVLLLFLVVREPPRGAVERHSDLPPL 
NPTS WWADLRALARNPS FVIjSSLGFTAVAFVTGSLALWAPAFLL 
RS RWLGETP PCL PGDS CSSSDSLI FGLITCLTGVLGVGLGVEI 
S RRLRHSNPRADPLVCATGL t>GSAP FIj FLSLACARGS 1 VAT Y I F 
I FIGETLLSMNWA I VAD ILLY VVI PTRRSTAEAFQI VLSHLLGD 
AGSPYLIGIilSDRLRRNWPPSFLSEFRALQFSLMLCAFVGALGG 
AAFLGTAHLH 


5415 


693 


2986 


IPPKTKLELQKH\ LTTLT \NQEQAT1 FEEVQKLRPRNEQRENEL 
IISFLRCLFEEKQKBHIHIGEMKQTSQMAAENIGSELPPSArRF 
RUDMLKNKAKRSJbTESIjES ILSRGWKARGLQBHS I S VDLDSShS 
STLSNTSKEPSVCEKEALPISESSFKLLGSSEDLSSDSESHLPE 
EPAPLS PQQAFRRRANTLSHFPIECQEPPQPARGS PGVSQRKIjM 
RYHSVSTETPHEKKDFESKANHLGDSGGTPVKTRRHSWRQQIFli 
RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCE3GPFGPPPE 
EKKRTSRELRELWQKAILQQILLLRMEKENQKLQASENDLLNKR 
LKLDYEE I TPCLKEVTTVWEKMLSTPGRS KIKFDMEKMHSAVGQ 
GVP\RHHRGEIWKFLAEQFHLKHQFPSKQOPKDVPYKELLKQLT 
SQQHAIL I DLiGRTFPTHP YFSAQLGAGQLSLYN ILKAYSLLDQE 
VGYCQGLSFVAGILLiLHMSEEEAFKMIiKFLMFDMGLRKQYRPDM 
IILQIQMYQLSRLLHDYHRDLYNHLEEHEIGPSIiYAAPWFLTMF 
ASQFPLGFVARVFDMIFLQGTEVIFKVALSLtiGSHKPLILQHEN 
LETI VDFI KS TL PNLGLVQMEKTINQVFEMDI AKQLQAYE VE YH 
VLQEELIDS S PL S DNQRMDKLEKTNS SLRKQNLDLL EQLQVANG 
RI QSLEAT r EKLLSSE SKLKQAMLTLELERSALLQTVEELRRRS 
AKPSDREPBCTQPEPTGD 


5416 


27 


4074 


KSQLFCF WGGKAGDI LSGD QDKE QKDP YFVETP YG YQLDLDFL K 
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDMKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLPPPS PQLPKHNLHVTKTLMETRRRLEQERATMQMTPGE F 
RRPRLASFGGMGTTS S LPS F VGSGNHNPAKHQLQNG YQGNGD YG 
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM 
AI ALKRLKELEEQVRTI PVLQVKI S VLQEEXRQLVSQLKNQRAA 
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC 
RSVAVGAEEWMNDIWYHRGSRSCKDAAVGTLVEMRNCGVSVTE 
AMLGVMTEADKEIELQQQTIESLKEKIYRLEVQLRETTHDREMT 
KLKQEIX}AAGSRKKVDKATMAQPLVFSKVVEAVVQTRI)QMVGSH 
MDLVDTCVGTSVETNSVGISCQPECKNKVVGFELPMNWWIVKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVOTLTLLKT 
NLNLKEVRS I GCGDCS VD VT VCS PKECAS RGVNTEAVS QVE AAV 
MAVP RTADQDTSTDLEQVHQFTNTETATL I ES CTNTCLS TLDKQ 
TSTQTVETRTVAVGfiGRVKDINSSTKTRSIGVGrLLSGHSGFDR 
PSAVKTKESGVGQIN1NDNYLVGLKMRTIACGPPQLTVGLTASR 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ 
TLLAENYSELAEAFGEPHSQMGSLNSQL ISTLSS INSVMKSAST 
EELRNPDFQKTSLGKITGSYLGYTCKOGGLQSGSPLSSQTSQPE 








QEVGTSEGKPISSIiDAFPTQEGTLSPVWIjTDDQIAAGLVACTNN 
ESTLKS IMKKKDGNKDSNGAKKNLQFVG INGG YETTSSDDSS SD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGIi 
KSA2VEDEMQVQECEPEKVE IRERYEI»S EKMLS ACNLLKNT 1ND 
PKALTSKDMR FCLNTIiQHEWFRVSSQKSAI PAMVGDY I AAFEAI 
SPDVLRYVINLADGNGNTALHYSVSHSWFEIVKIjLIiDADVCNVD 
HQNXAG YTP I MLAALAAVEAEKDMR I VE ELFGCGDVNAKASQAG 
QTALMIiAVSHGR I DM VKGLLACGADVN I QDDEGSTALMCASEHG 
HVE I VKLLLAQPGCNGHLEDNDGSTALS I ALEAGHKDI AVLLYA 
HVNFAKAQSPGTPRLGRKTSPGPTHRGSFD 


5417 


27 


4074 


KSQLFCFWGG:<AGDII»SGDQDKEQKDPYFVETPYGYQIiDLDFLK 
YVDDIQKGNTIKRLKIQKRRKPSVPCPEPRTTSGGQGIWTSTES 
LSSSNSDDWKQCPNFIilARSQVTSTPISKPPPPLETSLPFIiTIP 
ENRQLPPPSPQLPKHNIiHVTKTLMETRRRLEQERATMQMTPGEF 
RRPRLAS FGGMGTTSS LPSF VGSGNHNPAKHQLQNGYQGNGDYG 
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM 
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA 
SQINVCGVRKR3YSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TVEQSTQRI KEFRQL\TADMQALEQKIQDSSCEASS ELRENGEC 
R S VAVGAEE WMNDI VVYHRGSRS CKDAAVGTIiVEMRNCG VS VTE 
AM LG VMTEADKEI E LQQQTI E SLKEKI YRLE VQLRETTHDREMT 
KLKQELOAAGSRKKVDKATMAQPLVFSKVVEAVVQTRDQPWGSH 
MDLVDTCVGTSVETNS VG ISCQPECKNKWG PELPMNWWI VKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVKDLTLLKT 
NLNLKEVRS I GCGDCS VDVTVCS PKECAS RGVNTEAVSQ VE AAV 
MAVPRTADQI>TS TDLEQ VHQFTNTE TATI> IES CTNTCLS TLDKQ 
TSTQTVETRTVAVGEGRVKDINS STKTRS IGVGTEiLSGHSGFDR 
PSAVKTKESG VGQ INI NDN YLVGLKMRT I ACG P PQT iTVGLTASR 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYrERIQKLLAEQQ 
TLLAENYS ELAEAFGE PHS QMGS LNS QL I STLSS INS VMKS AST 
C»llijKNPJJ£fyK.T, SIjGK.1 XCjt&xijGx iCK.CGGJjQbGSPJUSSQTSQPE 
QBVGTSEGKPXSSLDAFPTQEGTLSPVNLTDEKJIAAGIiYACrNN 
ESTLKS IMKKKDGMKDSNGAKKNLQFVGINGGYETTS SDDS S SD 
ESSSSESDDECDVIEYPIiEEEEEEEDEDTRGMAEGHHAVWIEGL 
KSARVEDEMQVQEC^PEK^SIRERYEIjSEKMLSACWLLKNTIND 
PKALTS KDMR FCLNTLQHE WFRVS SQ XSA I PAMVGDY IAAFEAI 
SPDVLRWINLADGNGNTALHYSVSHSNFEIVKXLIJDADVCNVD 
HQNKAGYTP I MlAAIiAAVEABKDMRIVEELPGCGDVNAKASQAG 
QTAIWIJlVSKGRXDMVKGIJiACGADVNIQDDEGSTALMCASEHG 
HVE IVKLLLAQPG CNGHLE DNDG S T AL S lALEAGHKDIAVLLYA 
HVNFAXAQS PGTP RJLGRKTS PG PTHRGSFD 


5418 


24 


1133 


SVPRAGGDMETGAAELYDQALIiG I LQH VGNVQD FLR VTjFGFL YR 
KTDFYRLLRHPSDRMGFPPGAAQAIiVLQVFKTFDHMARQDDEKR 
RQEItEEKIRRKEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
DGHQEVE KVQ PPG P VKEMAHGS QEAE APGAVAGAAE VPR\ EP?I 
LPRIQEQFQKNPDSYNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQ VS VAIiSS SSI R VAMLEBNGER VLM3GKLTHKINTESS LWS L 
EPGKCVLVNLS KVGE YWWNAI LBGEEP IDIDKINKERSMATVDE 
EEO^\n^RLTFDYHQKLQGKPQSHELKVHEMLKKGWDAEGSPFR 
GQRFDPAMFTfl SPGAVQF 


5419 


1395 


259 


GTHPLDPDL.VSRTSVQGPLMTMACPGMSDTEESPFLiGPRAAEEG 
SESEACEAFGRRKSEEEGRRSDTSGFGRSRKHKVNWKHPERADA 
KDPASL PQC/ LG P / DCVRPAQ PSSKYCS DDCGMKLAANR I YE I L 
PQRIQQWQQSPCIAEEHGKKLLERIRREQQSARTRIiQEMERRFH 
ELEAIILRAKQQAVREDEESNEGDSDDTDLQIFCVSCGHPINPR 
VAIiRHME RCYAK YES QTS FGSMYPTR IEGATRLFCDVYN PQS KT 
YCKRLQVLCPEHSRDPKVPADEVCGCPLVRDVFELTGDFCRLPK 
RQCNRHYCWEKiRRAEVDLERVRVWYIGiiDEIiFEQERNVRTAMTN 
RAGLLALMLHQTIQHDPLTTDLRSSADR 


5420 


117 


1733 


HEAGGACPFKGGASGRLYLSPRIiPRVSVAGCEERPLGWVWVLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVIiSVREQLFHBRIR 








EC 1 1 STLLFATLY ILCHI FLTRFKKPAEFTI? \GMMKMPPSTRL/ 
LLELCTFTLAI AlfGAVLLL PP2 1 ISN3VLLSLPRNYYIQWLNGS 
LIHGLWNLVFLFSNLSLI FLMP FAYFFTESEGFAGSRKGVLGRV 
YETVVMLMLLTLL\nLGMVWVASAIVl>KNKANilES£iYDFI7EyYLP 
YLYSCISFLGVLLLLVCTPLGLARMFSVTGKLIjVKPRLLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPLDMELLHRQVIiALQTQRVL 
LEKRRKASAWQRNLGYPLAMLCLLVLTGLSVLIVAIHILELLID 
EAAM PRGMQGTSLGQVS FSXLGS FGAVIQWL I F YLMVSS WGF 
YSS PLFRSLR PRWHDTAMTQI IGNCVCLLVLSSALPVFSRTLGL 
TRFDLLGDFGRFNWIiGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RABIi I RAFGERE 


5421 


117 


1733 


NEAGGACPFKGGASGRLYIiSPRLPRVSVAGCEERPIjGWVWVLGG 
GGFLPARPPRAQRHU5FSHAEQSMEAPDYEVLSVREQLFHERIR 

LLELCTFTLAIALGAVLLLPFS I ISNEVtiLSLPRNYYIQWLNGS 
LIHGLWNLVFLFSWLSLIFLMPFAYFFTESEGFAGSRKGVLGRV 
YETVVMI^LLTLL\n^MVWVASAIVDKNKANRESljYDFWEYYLP 
YLYSCISFUSVLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPliDMELLHRQVLALQTQRVTi 
I^KRRKASAWQRWLGYPIiAMriCLLVIiTGDSVLrVAIHILELLID 
EAAMPRGMQGTS LGQVS FSKLGSFGAVIQWLI FYLMVSS WGF 
YSS PLFRSLRPRWHDTAMTQ I IGNCVCLLVLSSALP VFSRTLGL 
TRFDLLGDFG R FNWLGN F Y I VFT» YNAAFAGLTTLCLVKT FTAAV 
RAELIRAFGERE 


5422 


3 


1263 


SCGESLPTWLAGAS RPG IGRKGGAWGGRGGSSPAQVLLS PGP VF 
KAGCNW WHLiSRDQAG VQRCDLGS SQP P PLGFKR FS CLSLPS S WD 
YRSTVLCVSKMEADLSGFNIDAPRWDQRTFLGRVKHFLNITDPR 
TVFVSEREIjDWAKVMVEKSRMGVVPPGTQVEQI.T.YAKKIiYDSAF 
HPDTGE KMNV I GRMS FQLPGGM 1 1 TG FMLQ FYRTMP AVI FWQWV 
NQS FNALVNYTNRNAASPTS VRQMALS Y FTATTTAVATAVGMWM 
LTKKAP PLVGRW VP FAAVAAAKCVN I PMMRQQELI KGICVKDRN 
ENEIGHSRRAAAIG ITQWISRITMSAPGMILLP V rMERLEKLH 
FMQKVKVL/ SAP LQVMbSGCFL I FM VP VACGLF PQ KCELFVS YL 
EPKLQDTIKAKYGELEPYVYFNKGL 


5423 


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGEQ 
PPRLEAEGGtilSPVWGAEGIPAPTCWIGTDPGGPSRAHQPQASD 
ANREPVAERSEPALSGLPPATMGSGDLLLSGESQVEKTKLSSSK 
EFPQTLS LPRTTI CSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCLSQWKSVLS PGSAAQPS S CS ISASSTGSSLQGHQERAEPRG 
GSLAKVSSSLEPWPQEPSSVVGIiGPRPQWSPQPVFSGGnASGL 
GRRRI»SFQAE YWACVL PDSLPPS PDRHS PIjWNFNKE YEDlxLD YT 
YPliRPG PQXj PKKLDS RVPADPVLQDSGVDLDS FSVS PAS TLKSP 
TNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGLASW 
SQLASTPRAPGSRDARWfERREPALRGAKDRLriGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
LALPARLTQVSS LVS YLGS I STLVTLPTGD I KGQ5 PLE VSDS DG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSOSSSQALGVSSGLLKTRPSIiPARLDRWPFSDPDVEGQLPRK 
GGEGGKESL VQC \ VKTFC\ CQLEEL ICWL YNV\ ADVTDHGTPAR 
SNLTSLK\SSLQLYRQFKKDIDEHQSLTESVLQKGEILLQCLLE 
NTP VLEDVLGRIAKQSGELESHADRLYDS 1 LASLDMIAGCTL I P 
DKKPMAAMEHPCEGV 


5424 


3186 


905 


G VSMALGEEKAEAEASEDTKAQS YGRGSCREREIiD IPGPMSGEQ 
PPRIiEAEGGlilSPWGAEGIPAPTCWIGTDPGGPSRAHQPOASD 
ANREPVAERSEPAIiSGLPPATMGSGDLLLSGESQVEKTKIjSSSE 
E FPQTLSIi PRTT I CSGHDADTE DD PS LADLPQAUDLSQQ PHS SG 
I/SCI>SQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSlAKVSSSLEPVVPQEPSSVVGLGPRPQWSPQPVFSGGnASGL 
GRRRLS FQAEYWACVLPDSLPPS PDRHS PLWNPNKE YEDLIiDYT 
YPLRPGPQLPKHLDSRVPADPVLQDSGVDLDSFSVSPASTIiKSP 
TNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGIiASW 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SQIASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
LALPARLTQVS SLVS YLGS I S TLVTLPTGD I KGQSPLEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSLGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK 
GGEQGKESLVQC\VKTFC\CQLEEIiICWLYKV\ADVTDHGTPAR 

NTPVLEDVLGR I AKQSGELESHADRLYDS ILASLDMLAGCTLIP 
DKKPMAAMEHP CEG V 


5425 


1086 


115 


GFCPS P SLGHQ P PRVLHPTMSMAVETFGFFMATVGLLMLGVTLP 
NS YWRVSTVHGKVITTNT I FENLWFSCATDS LGVYNCWE FPSML 
ALSG Y X QACRALMITAI LLGFLGLLLGIAGLRCTNTGGLELS RK 
AKLAATAGA PH \ ILPG I CGMVAI \S W YA FN ITR\ DFS DPL YPGT 
KYELGPALYLGWSASL X S ILGGLCLCSACCCGSDEDPAASARRP 
YQ Ap VS VM P VATSDQ EGDS SFGKYGRNALRVAALCRGPRCL PTA 
PKKRGPGRGPFPYSNLRGRPRPVPVAPPRPRPRVLHSHGPSQAK 
NCSWEVAYLPSEAGSLIF 


5426 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP 
PAAHAKPD PG SGGQPAGPGAAGEALAVLT S FGRRLLVL I P VYLA 
GAVGLSVGFVLFGtALYLGWRRVRDEKKKSLRAARQLLDDEEQL 
TAKTLYMSHRELPAMVSFPDVEKAEWLNK3VAQVWPFLGQYMEK 
LIAETVAP AVRG SNPHLQT FTFTRVE LGEKPLR 1 1 GVKVHPGQR 
KEQ I LLDLNI S Y VGDVQ I DVEVKKYFCKAGVKGMQLHG VLRVIL 
EPrilGDLPFVGAVSMFFIRRPTLDIJm^TGMTNLLDIPGLSSLSD 
TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGIIRIHL 
LAARGLSSKDKYVKGIjIEGKSDPYAU^/R^TQTFCSRVIDEELN 
PQWGETYEVMVHEVPGQEIEVEVFDKDPDKDDFIiGRMKLDVGKV 
IjQASVLDDWFPLQGGQGQVH1(RLEW/jSIjL»SDAEKXjEQVLQWNWG 
VSSRPDPPSAAILWYLDRAQDLPMVTSELYPPQLKKGNKEPNP 
MVQLSIQDVXQESKAVYSTNCPVWEEAFRFFLQDPQSQELDVQV 
KDDSRALTLGALTLPLARLLTApELrLDQWFQLSSSGPNSRLYM 
KLVMRII^YLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR 
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVTCGKSDPY 
VKLKLAGRSFRSHVVREDLNPRWNEVFEVIVTSVPGQELEVEVF 
DKDLDKDDFI^RCKVRLTTVI^SGFLDEWLTLEDVPSGRLHLRL 
ERLTPRPTAAELEEVLQVWSLIQTQKSAELAAAIiLS I YMERAED 
LPLRKGTKHLSPYATL.TVGDSSHKTKTISQTSAPVWDESASFLI 
RKPKTE S LELQ VRGEGTG VLGSLSLPLS ELLVADQLCLDRWFTL 
S SGQGQ VLLRAQLG ILVSQESGVEAHSHS YSHS SS SLSEE PELS 
GGP PH I TS S APE V \ RQRLTHVDS PLEAPAGP LGQVKLTLW Y YS E 
ERKIjVS I VHGCRSLRQNGRDP PDP YVSLLLL PDKNRGTKRRTS Q 
KKRTLSPEFNERFEWEIiPI»DEAQRRKLDVSVKSNSSFMSREREL 
LGKVQLDLAETDLS QG VARW YDLMDN KDKGSS 


5427 


42 


3435 


XT^SQSl*GRADPPR<iGTMBRSfe>GEGPSPSPMDQPSAPSDPTDQP 
PAAHAKPD PGSGGQPAGPGAAGEALAVLTSFGRRLLVL I PVYLA 
GAVGLSVGFVLFGLALYLGWRRVRDEKERSLRAARQLLDDEEQL 
TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEX 
LLAETVAPAVRGSNPHLQTFTFTRVE LGEKPLR I IGVKVHPGQR 
KEQ ILLD LN IS YVGDVQID VEVKKY FCKAGVKGMQLHG VLR VI L 
EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD 
TMIMDS I AAFLVLPNRLLVPLVPDLQDVAQLRS PLPRGI IRIHL 
LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRVIDEELN 
PQWGETYEVMVHEVPGQEIEVEVFDKDPDKDDFLGRMKLDVGKV 
LQASVLDDWFPLQGGQGQVHLRLEMLSLLSDAEKLEQVLQWNVIG 
VSSRPDPPSAAILVVYLDRAQDLPMVTSELYPPQLKKGNKEPNP 
M VQLS I QDVTQE S KAVY STNCP VWEEAFR FFLQD PQS QELDVQV 
KDDSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 
KLVMR IL YLDSSE ICFPTVPGCPGAWDVDSENPQRGSS VDAPPR 
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 
VKLKLAGRS FRSHWREDLNPRWNEVFE VIVTSVPGQELEVEVF 
OKDIjDKDDFLGRCKVRIiTTVLNSGFLDEWLTLEDVPSGRLHLRL 
E RLTPRPT AAELEE VLQ VNS L IQTQKSAELAAALiLS I YMERAED 
LPLRKGTKHLS PYATLTVGDSSHKTKTI SQTSAPVWDESASFLI 
RKPHTESLE LQWGEGTG VLG SLSLPLS ELLVADQLCLDRWF TL 
SSGQGOVLLRAQLGILVSQHSGVEAHSHSYSHSSSSLSEEPELS 
GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPLGQVKLTLWYYSE 
ERKLVSIVHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ 
KKRTLSPEPNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
LGKVQLDLAETDLSQGVARWYDIiMDNKDKGSS 


5428 


3 


1839 


ssrserlsacaiappwlvssrparpaqlqrpgkmvedgaebled 
lvhfsvselpsrgygvmeeirrqgklcdvtlkigdhkfsahriv 
laas i pyfhamftndmmeckqde i vmqgmdpsalealinfayng 
nlai dqqnvcs llmgas flqlqs i kdacct flre rlhpknclgv 
rqfaetmmcavlydaansfihqhfvevsmseeflalpledvlel 
vsrdelnvkseeqvfeaaiawvrydreqrgtfl\rnlqsnxrll 
fcrpqflsdrvqqddlvrcchkcrdlvdeakdyllmperrphlp 
afrtrprccts i agli yavgglns agdslnwevfdp i ancwer 
crpmttarsrvgvawngllyaiggydgqlrlstvqayntetdt 
wtrvgsmnskrsamgt wldgq i yvcggydgnsslssvet ys pe 
tdkwtwtsmssnrsaa\gvtvfegriyvsgghdglqifssveh 
ynhhtatwhpaagmlinkrcrhgaas lgskmfvcggydgsgfls i 
aemyssv\adqwclivpm\htrr\srvslggpavgrlyavwgvt 
tgqsnl\ssvgdvltpetdcwtfm\apmacheggvgvgcipllt 
I 


5429 


923 


202 


RRE DALS SEGG LW PS ES T VSGNG I PE PQ VYAPPRPTDR LAVP P F 
AQF*ERFHRFQPTYPYLQHEIDLPPTISIiSDGEEPPPYQGFCTLQ 
LRDP EQQLELNRES VRAP PKRTI FDS DIjMDS ARLGGPCPPSS NS 
GISATCYGSGGRMEGPPP\TYSEVIGHYPGSSFQHQQSSGPPSL 
LEGTRLHHTHIAPLESAAIWSKEKDKQKGHPL 


5430


441 


1507 


qkrrkrrrkkimkt iqpkmhnsiswai ftglaalclfqgvpvrs 
gd atf p kamdnvtvrqge s atlrct i okrvtrvawlnrsti lya 
gndkwcldpr wllsntqtq ys i eiqnvdvydeg p ytcs vqtdk 
hpktsrvhlivqvspkivei ssdisinegnnisltciatgrpe p 
tvtwrhi spkavgfvsede ylei qgitreqsgdyecs asndv\a 
apvWrrvkvtvnyppyiseakgtgvpvgqkgtlqceasavpsa 
efqw ykddkrl i /egkkgvkvenrpflskli ffnvsehdygnyt 
cvasnklghtnasimlfgpgavsevsngtsrragcvwllpllvl 

HLLLKF 


5431 


2 


1312 


AAAAPGSRRRR PLPDRPHMAHG YEAP PP PAPRSFAWRARSKP V \ 
LPGIT inp \TIAEGP SP\TSEGASEANLVDLQKKLEELELDEQQ 
KECRLEAFLTQKAKVGELKDDDFERISELGAGNGGWTKVQHRPS 
GLIMARKLlHLEIKPAIRKQIIREIiQVJLHECNSPYIVGFYGAFY 
SDGEI S I CMEHKDGGSLDQVLKEAKRI PEE ILGKVS IAVLRGLA 
YLREKHQ t MHRD VKPSNTLVNSRGE I KLCDFGVSGQL I DSMANS 
FVGTRSYI^PERIiOGTHYSVQSDlWSKGLSIiVELAVGRYPrPPP 
DAKEUEAI FGRPWDGE EGE PHS XSPRPRPPGRPVSGHGMDSRP 
AMAI FELLD YI VNBF P P KLPNGVFTPD FQEFVNKCL I KNPAE RA 
DLKMLTNHTF I KRSE VBEVDFAGWLCXTLRLNQPGTPTRTAV 


5432 


2 


1312 


AAAAPGSRRRR P LPDRPHMAHGYEAPP P PAPRSPAWRARS KPV\ 
L PGITINP \T I AEGPS P \ TSEGASEANLVDLQKKLE EL ELDEQQ 

GLI MARKLIHLEI KPAIRNQ1 1 RELQVLHECNS PYIVGFYGAFY 
SDGEISICMEHMDGGSLDQVLKEAKRIPEEILGKVSIAVLRGLA 
YLREKHQ IMHRDVKPSNILVNSRGEI KLCD FG VS GQL I DSMANS 
F VGTRS YMAPERLQGTHYS VQSD I WSMGLSLVELAVGRYP I PP P 
DAKELEAI FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 
AMArFEXLDYtVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA 
DLKMLTNHTFIXR5EVEEVDFAGWLCKTLRLNQPGTPTRTAV 


5433 


360 


1885 


S VQEDKVGFED PLHLCSWRARACPCTWPHC/ CTGLXjECLGFAGV 
LFGWPSLVFVFKNEDYFKDLCGPDAGPIGNATGQADCKAQDERF 
SLI FTLGSFMNNFMTFPTG YI FDRFKTTVARIi IA I FFYTTATL I 
I AFTSAGSAVUjFLAMPMLT I GGI LFLI TNLQ IGNL FGQHRSTI 
ITLYNGAFDSSSAVFLIIKLLyEKGISLR/VLLHLHLCLQYLAC 
SrHFPPDAPGAHPIPTAPQLQLWPVpWEWHHKGREG/QQLSMKT 
GSYSQRSSFQRRKRPQGQGRSRNSAPSGATL/CSRRFAWHLVWI. 
SVIQLWHYLFIGTLNSLLTNMAGGDMARVSTYTNAFAFTQFGVL 
CAP WNGLLMDRLKQ KYQKEARRTGS S TLAVALCS TVPSLAI/TSI* 
LCLGFALCAS VPILPLQYLTF I LQVI SRS FL YGSNAAFLTIAFP 
SEHFGKLFGLVMALSAWSLLQFP1FTLIKGSLQNDPFYVNVMF 
MIiAIUuTFFHPFLVYRKCRTWKES PS Al A 


5434 


66 


652 


RYAAliIISLIQHKLLWRNQHCSRCVIMSPAQSAGLNWLF/GSGK 
HGPFIjGCSQYPACDYVRPLKSSADGHIVKVLEGQVCPACGANLV 
LRQGRFGMFIGCINYPECBHTBLIDKPDETAIXCPQCRTGHLVQ 
RRSRYGKTFHSCDRYPECQFA1NFKPIAGECPECHYPLLIEKKT 
AOGVfCHFCASKQCGKPVSAE 


5435


4704 


1597 


PGDSSQRIAEMSNAKERKHAKKMRNQPIWVTLSSGFVADRGVKH 
HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNEQSSSK 
GMFRKKGGWKAGPEGTSQE I PKY ITASTFAQARAAE I SAMLKAV 
TQKSSNSLVFQTLPPJiMRRRAMSHWKRLPRRLQEIAQKEAEKA 
VHQKKEHSKNKCHKARRCHMNRTLEFNRRQKKNIWLETHIW1IAK 
RPHMVKKWGYCLGERPTVKSHRACYRAMTNRCLLQDLSYYCCIjB 
LKGKEEE I LKALSGMCN IDTGLTFAAVHCLSGKRQGSLVLYRVN 
KYPREMI^GPVTFIWKSQRTPGDPSESRQLWIWLHPTLKQDILEE 
IKAACQCVEPIKSAVCIADPLPTPSQEKSQTELPDEKIGKKRKR 
KDDGENAKPIKKI IGDGTRDPCLPYSWISPTTGI I ISDLTMEMN 
RFRLIGPLSHSILTEAIKAASVHTVGEDTEETPHRWWIETCKKP 
DSVSLHCRQEAIFELLGGITSPAEIPAGTILGLTVGDPRINLPQ 
KKSKALPNPEKCQDNEKVRQLLLEGVPVECTHSF IWNQDI CKSV 
TENKISDQDI*NRMRSELLVPGSQLILG PHESK1 P ILLIQQPGKV 
TG EDRLGWGS GWDVLI* P KGWGMAFWI PFI YRGVR VGGLKESAVH 
SQ YKRS PNVPGD FPDCPAGMLFAEEQAKNIjLEKYKRRP PAKRPN 
YVKLGTIiAPFCCPWEQLTQDWESRVQAYEEPSVASSPNGKESDIi 
RRSEVPCAPMPKKTHQP5DEVGTSIEHPREAEEVMBAGCQESAG 
PERI TDQEAS BNHVAATGSHLCVLRSRKLLKQLSAWCGPSSEDS 
RGGRRAPGRGQQGLTREACLS ILGKFPRAtiVWVSLSIiLSKGS PE 
PHTMICVPAKEDFLQLHEDWHYCGPQESKHSDPFRSKILKQKEK 
KKREKRQKP\GRASSDGPAGEEPVAGQEALTLGLV?SGPLPRVTL 
HCS RTIiLGFVTQGDFSMAVGCGEALGFVSLTGLLDMIiS SQPAAQ 
RGLVLLRPPASLQYRFAR1AIEV 


5436 


1781 


635 


ASDS X PWSEARTTRKLAQRGCQWSLPERMPLVVFCGLP YSGKSR 
RAEELRVAtiAAEGRAVYVVDDAAVLGAJEDPAVYGDSAREKALRG 
AI*RASVERRLSRHDVVILDSLNYIKGFRYELY\CLARAARTPLG 
LVY CVR PGGP I AGPQVAGANENPGRNVS VS WRPRAE EDGRAQAA 
GSSVLREUiTADSWNGSAQABVPKEliERBESGAABSPAIiVTPD 
SEKSAKHGSGAFYSPELLEALTLRFEAPDSRNRWDRPLFTLVGL 
EEPLPLAGIRSALFENRAPPPHQSTQSQPLASGSFLHQLDQVTS 
QVLAGLMEAQKSAVPGDLLTLPGTTEHLRFTRPLTMAELSRLRR 
QFISYTKMHPNNENLPQLANMFLQYLSQSLH 


5437 


739 


1672 


CQEAASEFGGPLHTPAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 
PRRVD SS S ENS GSDWDSAPETMED VGHPKTKDSGALR VSRAASE 
PSKEEPQVEQLGSKRMDSLKWDQPISSTQESGRLEAGGASPKLR 
WDHVDSGGTRRPGVSPEX3GL\GVPGPGAPLEKPGRREKLLGWIiR 
GEPGAPSRYLGGPEECIK3ISTNLTLHIiLELIiASAXIiAI*CSRPLR 
AALDTIiGLRGPI^sLWLHGLLS FLAAI*HGLHAVLSLLTAHPLHFA 
CLFGLLQALVLAVS LREPNGDEAATDWE SEGItEREGEEQRGDPG 
KGL 


5438 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAPPSLRRPMMCQSEARQGPBLRAAKWLHFPQriAIiRRRLGQLSC 
MS RPALKLRS WPIaTVLYYLLP FGALR PIiSRVGWRPVSRVAli YKS 
V PTRLLSRAWGRLNQVE LPH W LRR P VYS L Y I WT FGVN MKE AAVE 
DliHHYRNLSEFFRRKLKPCSARPVCGLHSVrSPSDGRILWFGQVK 
NCEVEQVKGVTYSLESFLGPRMCTEDLPFP PAASCDS FKNQLVT 
REGNELYHCV I YLAPGDYHCFHS PTDWT VSHRRHF PGS LMS VNP 
GMAR W I KBL FCHNER WLTGDWKHG FPS LTAVGAT \ NWGS I R I Y 
FDRDIiHTN S PRHS KGS YND FS FVTHTNREGVPMALRGEHLG/QS 
FNLGS T I VLI FEAPKD FN FQLKTGQKI RFGEALG SL 


5439


2443 


1152 


tkprkrrhqpasqrqrpws sdstgdllargkgrketenkgsdrvs 
iappslrrpmmcqsearqgpelraakwlhfpqlaiirrrlgqlsc 
msrpalklrsmpltvlyylz.pfgai1rpi1srvgwrpvsrvalyks 
vptrllsrawgrl:?qvelphwlrrpvyslyiwtfgvnmkeaave 
dlhhyrklseffrrklkpqarpvcglhsvispsdgrir^fgqvk 
nceveqvjcgvtyslesflgprmctedlpfppaascdsfknqlvt 
regnelyhcviyiiapgdyhcfhsptdwtvshrrhfpgslmsvnp 

GMARW I KELFCHN3RVVLTGDWKHGF FS LTAVGAT \NWGS IRIY 
FDRDLHTNSPRHSKGSYNDFSFVTHTHREGVPMALRGEHLG/QS 
FKLGSTIVLIFEAPKDFNFQLKTGQKIRFGEALGSL 


5440 


693 


253 


EP I PVTPDHRLVTMTHI V\QTFSPVMS \GQPPNYEMIjKEEQEVA 
MLG APHNPAPPMS T VIHI RS ETS VPDH WWSLFNTL FMNT CCLG 
FIAFAYSVKSRDRKMVGDVTGAQAYASTAKCLNIWAI.IIjGIFMT 
XLLI I IPVLWQAQR 


5441 


2 


2054 


CRDGGKNGFMVS PMKPLE I KTQCSGPRMDPKICPADPAFFS FIN 
NSDLWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATFVIQEE 
FDRFTGYWWCPTASWEGSEGLKTLRI LYEEVDES EVEVIHVPS P 
ALEERKTDSYRYPRTGSKNPKIALKIAEFQTDSQGKIVSTQEKE 
LVQPFS SLFPKVEY I ARAGWTRDGKYAWAMFLDRPQQWLQLVLL 
P PALPI PSTENBEQ \RLASARAVPRNVQPYWYEEVTNVWINVH 
DIFYPFPQSEGEDEI*CFLRANECKTGFC25LYKVTAVLKSQGYDW 
SEPFSPGEGEQSLTNAIWVNEETKLVYFQGTKDTPLEHHLYWS 
YBAAGEIVRLTTPGFSHSCSMSQNFDMFVSHYSSVSTPPCVHVY 
KLSGPDDDPIiHKQPRFWASMMEAAKIFHFHTRSDVRLYGMIYKP 
HALQPGKKHPTVLFVYGGPQVQLVNNSFKGI KYLRLNTLASLGY 
AVWI DGRGS CQRGLR FEGALKNQMGQ VE I BDQVEGLQ FVAE KY 
GFIDLSRVAIHGWSYGGFLSLMGLIHKPQVFKVA1AGAPVTVWM 
AYDTGYTERYMDVPENNQHGYEAGSVALHVEKLPNEPKRLLILH 
GFLDENVHFFHTNFLVSQL I RAGK P YQLQVALPP r VS PQIYPNER 
HS I RC PE SGEHYE VTLLHFLQE YL 


5442 


1 


3474 


cgqrs r r r s pdm pe ak p aakkap kg kda p kg ap keap p ke ap ae
apkeappedqsptaeeptgvflkkpdsvsvetgkdavvvakvng 
kelpdkpt i kwfkgkwlielg s ksgarfs fkeshns asnvytve l 
higkwlgdrg yyrl e vkakdtcds cgfni dveaprqdasgqsl 
esfkrtsekkseh:ageldfsgllkkrevveeekkkkkkddddlg 

X PPE IWELLKGAKKSEYEKIAFQYG I TDLRGMLKRLKKAKVEVK 
KSAAFn , KK3*DPAYQVDRGNTKlKLMVEISDPDLrXiKWFKNGQEIK 
P S S KYVFENVGKKR I LT INKCTLADDAAYE VAVKDEKCFT ELFV 
KEP P VL I VTPLEDQQVFVGDRVEMAVE VS EEGAQVMWMKDGVEL 
TREDSFKARYRFKKDGKRHILIFSDWQEDRGRYQVITNGGQCE 
AELIVEEKQLEVLQDIADLTVKASEQAVFKCEVSDEKVTGKVJYK 
NGVEVR PS KRITISHVGRFHKLVI DDVRPEDEGDYTFVPDGYAL 
GSLSAKLNFLEIKVEYVPKQ\EPPKIPLGFASGGKTSEKAD/IV 
VVAGNKLRLDV\SITGEAPSPFAT\WIiKG\DEVFTTTEGRTRIE 
KRVDCSS FVIESAQREDEGRYTIKVTNPIGEOVAS IFLQWDVP 
DPP E AVRI TSVGEDWAI LVWEPPMYDGGKP VTG YLVKRKKKGSQ 
RWMKLNFEVFTETTYESTKMIEGILYEMRVFAVNAIGVSQPSMN 
TKP FMP I APTS3 PLHLI VEDVTDTTTTLKWR P PNR I GAGG I DG Y 
LVEYCLBGSEEWVPANTEPVERCGFTVKNLPTGARI LFRWGVN 
IAGRSEPATLAQPVT IRE I AEPPKI RLPRHLRQTYIRKVGEQLN 
LWPFQGKPRPQVVWTKGGAPIJ3TSR^^^VRTSDFDTVFFVR0AA 
RSDSGE YELS VQ I ENMKDTATIR I RWE KAGPPINVMVKEVWGT 
NAX VEWQAPKDDGNSE IMG YFVQKAD KKTMEWFNV YBRNRHTS C 
TVSDLIVGNEYYFRVYTENICGLSDSPGVSKNTARILKTGITFK 
PFEYKEHDFRMAPKFLTPL I DR WVAG YS AALNCAVRGHP KP K V 
VWMKNKMEIREDPKFLIT17YQGVLTLWIRRPSPFDAGTYTCRAV 
C3ELGEALABCKLEVRVPQ 


5443 


66 


1003 


SRGQLDAGQS S EQHGGNRQPEQSRSRSSS SSSS PRRSRSAAE PA 
MALSMPLNGLKEEDKEPLIELFVKAGSDGESIGNCPFSQRLFMI 
LWLKGWFSVTTVDLKRKPADLQWLAPGTHPPFITFNSEVKTDV 
NKIEEFLBEVLCPPKYLKLSPKHPESNTAGMDIFAKFSAYIKNS 
RPEANEAIiERGLLKTLQKUDEYLWSP^PDEIElENSMEDIKFSTR 
KFLDGNEMTliADCNMjPKIiHIVKVVAKKYRNFDIPKEMTGIWRY 
LTNAYSRDEFTNTCPSDKEVEI\AYSDVAKRLHQVKSRLLKEVS 
FMSSP 


5444 


2 


344 


SGPIGVTGAQMAKWIiRDYIjSFGGRRPPPQPPTPDx IJc.SDXJjKAx 
RAQKNLDFEDPY*DSESRLEPDPAGPGDSKNPGDAKYGSPKHRL 
IKVEAADMARAKALLGGPGEELEADTEYLDPFDAQPHPAPPDDG 
YMEPYDAQWVMSSLPGRGVQLYDTPYEEQDPETADGPPSGQKPR 
QSRMPQEDERPADEYDQPWEWKKDHISRAFAVQFDSPEWERTPG 
SAKE LRR P PP RS PQ PAERVDPAL P1*E KQP WFHG PLKR ADAES IjL 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
LGQHSGPFPSVPEpvr.HYSSRPLPVQGAEHIiALLYPWTQTP*Q 
*PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGLHRERHPEGLP 
RAEKPGLRGPLLGLREPLGAGPRGPWGLQEPRRCQVWFSQAPAH 
QGGGCGYGQSQGPSGRPRGGAGSRH 


5445 


2364 


486 


ILSRGFLGSVEICIQLPLPASEPVIiLJjTWARRRWRETRSRREPT 
TLRAQSVCPWWI*ETRMtJRSIPVEVDESEPYPSQLLKPIPEYSP 
EBESEPPAPNIRNMAPNSLSAPTMLHNSSGDFSQAHSTLKtANH 
QRPVSRQVTCLRTQVLEDSEDSFCRRHPGLGKAFPSGCSAVSEP 
ASESWGALPAEHQFSFMEKRNQWLVSQLSAASPDTGHDSDKSD 
QSLPNASADSLGGSQEMVQRPQPHRNRAGLDLPTIETGYDSQPQ 
DVLGIRQLERPIjPIjTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QMBPPNIiSPHAPWNYHYHCPGSPDHQVPYGHDYPRAAYQQVIQP 
ALPGQPLFGASVRGLHPVQKVILNYPSPWDQEERPAQRDCSFPG 
LPRHQDQPHHQPPNRAGAPGESLECPAEIJIPQVPQPPSPAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVNFLLV 
NGFQTAIDlFEDRIRGIDriKWMERYLRDKTVMIIVAISPKYKQ 
DVEGAESQLDEDEHGLHTKYIHRMMQIEF1KQGSMNFRFIPVLF 
PNAKKEHVPTWLQNTHVYSWPKNKKNILLRLLREEBYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


SS WS WCTGRMRKTRljWGLLWMLFVSELRAATiCLTEEKYELKEGQ
TIiDVKCDYTLEKFASSQKAWQI IRDGEMPKTIiACTERPSKNSHP 
VQVGRIILEDYHDHGLIiRVRMVNLQVEDSGLYQCVIYQPPKEPH 
MLFDRIRLWTKGFSGTPGSNENSTQNVYK1PPTTTKALCPLYT 
TPRTVTQAPPKSTADVSTPDSEINLTNVTDIIRVPVFNIVILLA 
GGFLSKSLVFSVXiFAVTLRSFVP*AHEPTRMSSDFQPHPSGSCA 
KGGGRR 


5447 


207 


617 


MTARTIiSLMASLVAYDDSDSEAETEHAGSFKATGQQKDTSGVAR 
PPGQDFASGTLDVPKAGAQPTKHGSCEDPGGYRIiPIAQLGRSDR 
GSCPSQRLQWPGKEPQVTFPIKEPSCSSLMTSHVPASHMPLAAA 
RF KQ VJUj&KN r PKobr rlAypnolil Vv*itNisoo*J vft.r^i\\*JeiJJ<L- v v fi 
TPRRLRQRQALS TETGKGKD VEPQGP PAGRAPAPL YVG PG VS EF 
I QP YLNSHYKETTVPRKVL FHLRGHRGP VNT I QWCP VLS KSHML 
LSTSMDKTF KVWNAVDSGHCLQTYS LHTE AVRAARWAP CGRR IL 
SGGFDFALHLTDLETGTQLFSGRSDFRITTLKFHPKDHNIFLCG 
GFSSEMKAWDIRTGKVMRSYKATIQCJTLDILFLREGSEFLSSTD 
ASTRDSADRTlIAWDFRTSAKIStlQIFHERFTCPSIiALHPREPV 
FLAQTNGNYLAIiFSTVWPYRMSRRRRYEGHKVEGYSVGCECSPG 
GDLLVTGS ADGRVLM YS FRTAS RACTliQGHTQ AC VGTT YH PVLP 
SVLATCSWGGDMKIWH*AFHWLSLGEAIGDLAPARGY5GPGRSL 
KSPSPSKSLLVIiLCGRAMFQPATCPWQIiPAIjSK 


5448 


194 


1833 


MAS KVTD AI VW YQKK I GAYD QQ I WE KS VE QRE I KGLRN KP KKTA 
HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKGIVRWFFPFFF 
R WWLQVTS KV1 FFWLL VLYLLQVAA IVh FCSTS S PHS I PLTE VI 
GPI WLMLLLGTVHCQ I VSTRTP KPPL STGGKRRRKLRKAAHLEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
G S KKAKSTS IDKS TETDNG Y VS LDGKKT VKSGEDG IQNHEPQCET 
IRPEETAWNTGTLRNGPSKDTQRriTNVSDEVSSESGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLWEDIjLHCAECHSSCTSETDVENHQINPC 
VKKEYRDDPFHQSKLFWLHSSHPGLEKISAIWEGNDCKKADMS 
VLB I SGMIMNRVNSH I PG IG YQI FGNAVSIiI LGLTP F VFRLSQA 
rDLEQLTAHSASELYVlAFGSNEDVIVLSbavIISFWRVSLVWI 
FFFLLCVAERTYKQVGIM * TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTSCSSRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCT 
SETDVENHOINPCVKKEYRDDPFHQSHLPWLHSSHPGLEKISAI 
VWEGND CKKADMS VLE I SGM I MNRVNS H I PG IG YQ I FGNAVSLI 
LGLTP FVFRLS QATDLEQLTAHSAS EL Y V I AFGSNEDVT VLSMV 
I IS FWR VSL VWI FFFLLCVAERT YKQ VG IM 


5449 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA 
HVKPDLIDVDLVRGSAFAKAKPES PWTSLTTKG I VRWFFPFFF 
RWWLQVTS KVI FFWLLVL YLLQVAAI VL FCSTSS PHSIPLTEVI 
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGSSTTDNTQBGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GSKKAKNS IDKSTETDNGYVSLDGKKTVKSGEDG I QNHEPQC3T 
IRPEETAWNTGTLRNGPS KDTQRT Z TWVSDE VS S E EGPE TG YS L 
RRHVDRTS EGVLRNRKSHH YK KHYPN EDAPKSGTS CSSRCS S S R 
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQINPC 
VKKEYRDDPFHQSHLPMLHSSHPGLEKISAIVWEGNDCKKADMS 
VLEISGMIMMRVNSHIPG3GYQIFGNAVSLILGLTPFVFRLSQA 
TDLEQLT AHSAS ELYVI AFG SNEDV I VLS MVI 1 S FWRVSLVWI 
FFFLLC VAERTYKQVG I M * TSEGVIiRNRKSHHY KKHYPNEDAP K 
SGTSCSSRCSSSRQDSESARPESETEDVIiWEDLLHCAECHSSCT 
SETDVENHQINPCVKKEYRDDPFHQSHLPWLHSSHPGLEKISAI 
VWEGNDCKKADMSVLE I SGMIMNRVNS HI PGIGYQ I FGNAVSLI 
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSNSDVIVLSMV 
I ISF WRVSLVWI FFFLLCVAERTYXQVGIM 


5450 


8136


1242


GQQFAS FFG*NHPEVTVAMALTDIDLQLQFSMSQPEA3XLLAAG
PADHLLLQLYSGHLQ VRLVIiGQE BLRLQT PAETLLSDS I PHT W 
LTWEGWATLSVDG FLNAS S AVPGAPLE VPYGLF VGGTGTLGL P 
YLRGT S RPLRG CLHAATLNGR SLLRPLTPDVHEG CAEEFSASDD 
VALGFSGPHSLAAFPAWGTQDEGTLEFTLTTQSRQAPLAFQAGG 
RRGDF I Y VDIFEGHLRAWEKGQGTVLLHNSVPVADGQPHEVSV 
HINAHRLEISVDQYPTHTSNRGVLSYLEPRGSIXLGGLDAEASR 
HLQFJ5RLGLTPEATKASLLGCMEDLSVNGQRRGLREALLTKNMA 
AGCRLEEEEYEDDAYGHYEAFSTIAPEAWPAMELPEPCVPEPGL 
PPVFANFTQLLTISPLWAEGGTAWLEWRHVQPTLDLMEAELRX 
SQVLFSVTRGAHYGELELDILGAQARKMFTLLDWNRKARFIHD 
GSEOTSDQLVLEVSVTARVPMPSCLRRGQTYLLP IQVNPVNDP ? 
HirFPHGSLMVILEHTQKPLGPEVFQAYDPDSACEGLTFQVLGT 
SSGLP VERRDQPGE PATEFS CRELEAGSL VYVHCGG PAQDLTFR 
VSDGLQAS P PATLKVVAIRPAIQ IHRS TGLRLAQGSAMPILPAN 
LSVETNAVGQDVSVLFRVTGALQFGELrQKHSTGGVEGAEWWATO 
AFHQRDVEQGRVRYLSTDPQHHAYDTVElTIiALEVQVGQE ILSNL 
SFPVTIQRATVWMLRLEPLHTQNTQQETLTTAHLEATLEEAGPS 
PPTFHYEWQAPRKGNLQLQGTRLSDGQGFTQDDIQAGRVTYGA 
TARAS EAVEDTFRFRVTAPP YFS PLYT FP IH I GGDPDAP VLTNV 
LLWPEGGEG VLSADHL FVKS LNSAS YL YE VMER PRLGRLAWRG 
TQDKTTMVTS FTNEDLLRGRLVYQHDDSETTEDD I PFVATROGE 
S SGDMAWEEVRGVFRVAI QPVNDHAPVQT I SR I FHVARGGRRIiL 
TTDDVAFS DADSGFADAQLVLTRKDLLFGS I VAVDEPTRP I YR F 
TQBDLRKRRVLFVHSGADRGWIQLQVSDGQHQATALLEVQASEP 
YLRVANGS SLWPQGGQGTIDTAVLHLDTNLD I RSGDEVHYHVT 
AGPRWGQLVRAGQP ATAFSQQDLLDGAVLYSHNGS LS P EDTMAF 
SVEAGPVHTDATLQVTrALEGPLAPLKLVRHKKIYVFQGEAAEI 
RRDQLEAAQE AVPPADT VF"SVKS PPSAG YLVM VSRGALADEP PS 
LDP VQS FSQEAVDTGRVL YLHSRPEAWS DAFS LDVASGLGAP LE 
GVLVELBVLPAAI P LEAQNFS VP EGGSLTLAP PLLRVSGP YFPT 
LLGLSLQVLEPPQHGPLQKEDGPQARTLSAFSWRMVEEQLIRYV 
HDGSETLTDS FVLMANAS EMDRQSHP VAFTVTVLP VNDQ PPILT 
1NTGLQMWEGATAP I PAEALRSTDGDSGSEDLV YTIEQPSNGRV 
VLRGAPGTEVRSFTQAQLDGGtiVLFSHRGTLDGGFPFRLSDGEH 
TSPGHFFRVTAQKQVLLSLKGSQTLTVCPGSVQPLSSQTLRASS 
S AGTD PQLLIi YR WRGPQLGRLFHAQQDSTGE AIiVN PTQAE VYA 
GN I LYEHEMPPEPFWEAHDTLELQLS S PPARDVAATLAVAVSFE 
AACPQRPSHL WKNKGLWVPEGQRARITVAAIiDASNLLAS VPS PQ 
RSEHDVLFQVTQFPSRGQLLVSEEPLHAGQPHFLQSQLAAGQLV 
YAHGGGGTQQDGFHFRAHLQGPAGASVAGPQTSEAFAITVRDVN 
ERPPQPQASVPLRLTRGSRAPISRAQLSVVDPDSAPGEIEYEVG 
RAPHNGFLSLVGGGIiGPVTRFTQADVDSGRLAFVANGSSVAGlF 
QLSMSDGASPPLPMSIiAVDILPSAIEVQIiRAPLEVPQALGRSSL 
SQQQLRWSDREEPEAAYRIilQGPQYGHLLVGGRPTSAFSQFQI 
DQGEWFAFTNFS SSHDHFRWuALARGVNASAVVWTVRALI^ 
WAGG P WPQGATLRliDPT VLDAGE LAfJRTGS VPRFRLLEG PRHGR 
WRVPRARTEPGGSQIjVEQFTQQDLEDGRLGLEVGRPEGRAPGP 
AGDSLTLELWAQGVPPAVAS LDFATEP YNAARP Y S VALL S VP EA 
ARTEAGKPESSTPTGEPGPMASSPEPAVAKGGFLSFLEAWMFSV 
IIPMCLV3XLLAXiILPLLFYLRKRNKTGKHDVQVX,TAKPRNGUA 

gdtetfrkvepgqaipltavpgqgpppggqpdpellqfcrtpnp 
alkngqywv 


5451 


1 


2274 


rdsseqgrtgdtlgrpsacmdalkppclwrnhergkkdrdscgr 
knsepgsphslealrdaapsqgi^plllftkmlfifnflfsplp 
tpali ci lrtfgaai fiiwl i trpqp vlpxrldlnnqs vgi eggark 
gvsqknndltsccfsdaktmyevfqrgi*avsdngpclgyrkpnq 
pyrwi,sykqvsdraeylgscllhkgyksspdqfvgifaqnrpew 
i iselacytysmvavplydtlgpkai vh i vmkadiamvicdtpq 
kalvli gnvekgftpslkvi tlmdpfdddlkqrge ksgie ils l 
ydaewlgkehfrkpvppspedlsvicftsgttgdpkgamithqn 
ivsnaaaflkcvehayeptpddvaisylpiahmferivqavvys 
cgarvg ffqgd i rlladdmktlkptlfpavprllnr i ydkvqne 
aktplkkfliiklavsskfkeiiqkgi i rhdsfwdkli faki qds t> 
ggrvrvivtgaapmstsvmtffraamgcqvyeaygqtectggct 
ftlpgdwtsghvgvplacnyvkledvadmnyftvnnegevci kg 

•l'NVFKGYIiKDPEKTQEALDSDGWLHTGDIGRWLPNGTIiKIIDRK 
KNIFKIAQGEYIAPEKIENIYNRSQPVliQIFVHGESLRSSLVGV 
WPDTDVLPSFAAKLGVKGSFEEI.CQNQWREAILEDLQKIGKE 
SGLKTFEQVKAIFLHPEPFSIENGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452


1833 


1138


SRVPS LCLS LS I»S LS PSREPVAGAPGCGTAGPPAMATL WGGLLR 
LGSLLSLSCLALSVIiIitAQLSDAAKNFEDVRCKCICPPYKENSG 
H I YNKN I SQKD CDCliHWE PM P VRGPDVE AYCJjRCE CKYEERS S 
VTIKVTI 1 1 YLS ILGLLXLYMVYLTIArePIIiKRRkFGHAQLIQS 
DDDIGDHQPFANAHDVIiARSRSRANVLNKVEYAQQRWKLQVQEQ 
RKS VFDRHWL S 


5453 


111


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGBQAV 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
PQEERSQQQDDIEELETKAVGMSNDGRFLKFDI E IGRGSFKTVY 
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI 

RSWCRQH.KGLQFLHrRTPPIIHRDI^KCDNIFITGPXGSVKIGD 
LGLATLKRAS FAKS VIGTPE FMAP EMY EEKYDES VDVYAFGMCM 
LEMATSE Y P Y SE CQNAAQI YRRVTSGVKPAS FDKVAI PE VKE 1 1 
EGCIRQNKDERYSIKDLLNHAFFQEETGVRVErAEEDDGEKIAI 
KL W LR I E D I KKL KG K YKDNE AI EF S FDLERNVPEDVAQEMVESG 
YVCEGDHKTMAKAI KDRVSLI KRKREQRQIi* 


5454 


111 


1520 


PS IPAAVP QS AP PEPHREETVTATATSQVAQQP PAAAAPGEQAV 
AGPAP S TVPS STS KDRP VSQPS JJVG S KEE P PPARSGSGGGSAKE 
PQEERSQQQDDIEBIiETKAVGMSNI^RFLKFDlEIGRGSFKTVY 
KGIOTBTTVEVAWCEliQDRKLTKSERQRFKEEAEMLKGLQHPNI 
VRFYDSWESTVKGKKCIVI*VTELMTSGTLKTYLKRFKVMKIKVL 
RSWCRQXLKGLQFIiHTRTPPI IHRDIiKCDNI FITGPTGSVKIGD 
LGIiATLKRASFAKSVIGTPEFMAPEKYEEKYDESVDVYAFGMCM 
LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAXPEVKEII 
EG C IRQNKDER YS I KDLLNHAFFQEETGVR VELAEEDDGE KIAI 
KLWLRIEDI KKLKGKYKDNEAIE FSFDLERNVPEDVAQEMVESG 
YVCEGDHKTMAKAXKDRVSLIKRKREQRQL * 


5455


1359 


377 


LTMVS PATRKSL PKVKAMDFITSTAI tiPLLFGCLGVFGljFRLLQ 
WVRGKAYLRNAWVI TG ATSGLG KECAKVF YAAG AKLVLCGRNG 
GALEELIRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEIL 
QCFGYVD I LVNKAGI S YRGTIMDTTVDVDKRVMETNYFGPVALT
KAI^LPSMIKRRQGHIVAISSJQGKMSIPFRSAYAASKHATQAFF 
DCLRAEMEQYE I EVTV I S PG YIHTNLSVNAITADGSRYGVMDTT 
TAQGRSPVE VAQDVLAAVGXKKKDVX LADLL PSLAVYLRTLAPG 
LFFSLMASRARKERKSKNS 


5456 


2 


2332 


CGAGL VAAGAVLVLYPASRAGERTRVP3S PAPSSLPLHS PGACG 
TEVDMDPQRSPLLEVKGNI ELKRPI*I KAPSQLPLSGSRLKRRPD 
QMEDGLEPEKKRTRGLGATTKITTSHPRVPSLTTVPQTQGQTTA 
QKVSKKTGPRCSTAIATGLXNQKPVPAVPVQKSGTSGVPPMAGG 
KKPSKRPAWDLKGQLCDLNAELKRCRERTQTLDQENQQLQDQLR 
nAQQQVKAI^TERTTLHGHIiAXVQAO^EtaGQQELKEILRAC^EL 
EBRLSTQEGLVQELQKKQVELQEERRGIiMSQLEEKERRLQTSEA 
ALSSSQAEVASLRQETVAQAALLTEREERLHGLEMERRRIjHNQL 
QELKGNIRVFCRVRPVLPGEPTPPPGLLIjFPSGPGGPSDPPTRL 
SLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDEVFEEIA 
MLVQSAliDGYPVCI FAYGQTGSGKTFTMEGGPGGDPQLEGLI PR 
ALRHLFSVAQELSGQGWTYS FVASYVE I YNETVRDLLATGTRKG 
QGGECE I RRAGPGSEELTVTNARYVPVSCEKEVDALLHtiAROMR 
AVARTAQNERS SRSKS VFQLQ I SGEHS SRGLQCGAPLS LVDLAG 
SERliD PGLALG PGERERLRETQ AINS SLSTLGIjVI MALSNKE S H 
VPYRNSKLTYLLQNSLGGSAKMLMFVNI S PJLE ENV3 ES LNS LRF 
ASKVEPSVLFGTAQSNRKV7KTDPDLCVCVCVCVCVCVCVCVCVP 
MSMYRVRGGRVAGGCFIGHRAPCPRAIK 


5457 


2 


1540 


DDFVERRRWTRTTCbVRS pphvpvcghacswnggs ldplkgt PA 
LLRSAERLMRKVKKLRUDKENTGSWRSFSUJSEGAERMATTGTP 
TADRGDAAATDDPAARFQVQKHSWDGLRS I IHGSRXYSGLI VNK 
APHDFQFVQKTDESGPHSHRLYYIiGMPYGSRENSL^YSEIPKKV 
RKEALLLLSWKQMI/DHFQATPHHGVYSREEELLRERKRLGVFGI 
TSYDFHSESGLFLFQASNSLFHCRDGGKNGFMVSPGPGCVSPMK 
PIiEIiKTQCSGPRMDPKICPADPAFFSFXNNSDLWVANIBTGEER 
RLTFCHQGLSNVLDDPKSAGVATFVIQEEFDRFTGYWWCPTASW 
EGSEGLKTLRILYEEVDESEVEVIHVPSPALEERKTDSYRYPRT 
GSKNPKIAI*KLAEFQTDSQGKXVSTQEKELVQPFSSLFPKVEYI 
ARAGWTRDGKYAWAMFLDRPQQWLQLVLLPPALFIPSTENEEQA 
ASLCOSCPQECPAVCGVRGGHQRIiDQCS 


5458 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVMEAQPEWLRAEV 
KRLSEEIxAETTREK I QAAE YGIAVIiEE KHQLKtjQ FEELEVD YE A 
IRSEMEQLKEAFGQAHTNHKKVAADGESREESLIQESASKEQYY 
VRKVLEUJTELKQLRNVLTNTQSENERLASVAQELKEINQNVEI 
Q RGRLRDD I KE YKFR E ARLLQD YS ELEEENISI^QKQVSVLRQNQ 
VEFEGLKHEIKRLEEETEYLNSQIiEDAIRLKEISERQtiEEALET 
LKTEREQKNSLRKELSHYMSINDSFYTSHI/HVSLDGLKFSDDAA 
E PNNDAEALVNG FEHGGLAKLPLDNKTSTP KKEGLAP PSPSLVS 
DLLSELNISEIQKLKQQLMQMEREKAGLIiATLQDTQKQLEHTRG 
SLSEQQEKVTRLTENLSALRRIjQASKERQTALDNEKDRDSHEDG 

dyyevdi ngp eilackyhvavaeagelre qlkalrsthe areaq 
haeekgryeaegqaltekvsllekasrqdredlarlejcelkkvs 

DVAGETQGSLSVAQDELVTFSEEIiANLYHHVCMCNNETPNRVMIj 
D Y YREGQGGAGRTSPGGRTS PEARGRRS P I LLPKGIiLAPEAGRA 

dggtgds s pspgsslps plsdprrepkni ynli AX I RDQIKHLQ 

AAVDRTTELSRQRIASQELGPAVDKDKEALMEEILKLKSIiLSTK 
REQ ITTLRT VZj KAN KQTAEVALAN hKS KYENEKAMVTETMMKLR 
NEliKALKEDAATPS S LRAMFATRCDE Y ITQLDEMQRQLAAABDE 
KKTLMSLLRMAIQQKLALTQRLELLELDHEQTRRGRAKAAPKTK 
VATPSVS HTCACASDRA EGTGLANQVFCS EKHS X YCD 


5459 


316 


1262 


RGGHRLSGMASNFND I VKQGYVRIRSRRLGI YQRCWLVFKKAS S
KGPKRLEKFSDERAAYFTiCYKKVTELNNVKNVARIjPKSTKKHAI 
GI YFNDDTS KTFACESDLEADEWCKVLQMECVGTR INDI SLGEP 
DLIATGVEREQSERFNVYI^IPSPNLGCYMGECALQITYEYICLW 
DVQNPRVKL I S WPLSALRR YGRDTTWFTFE AGRMCE TGEGLFI F 
QTRDGEAI YQKVHSAALAI AEQHERLLQS VKNS MLQM KMS ERAA 
S LS TM VPLPRS A YWQHI TRQHS TGQL YRLQ DVS S PLKLHRTET F 
PAYRSEH 


5460 


45 


2097 


RPGCRAGELSTGSRARERVRNRVSAPCGQDSRRCDPEVliRGRSP 
GLGLAEMPS CGACTCGAAAVRL I TS S LAS AQRG I SGGRIHMS VL 
GRLGTFETQ1LQRAPLRSFTETPAYFASKDGISKDGSGDGNKKS 
ASEGSSKKSGSGNSGKGGNQLRCPKCGDLCTHVETFVSSTRFVK 
CEKCHHFFWLSEADSKKSIIKEPESAAEAVKIiAFQQKPPPPPK 
KI YNYLDK YVVGQS FAKKVtiS VA VYNH YKR I YNNIPANLRQQAE 
VEKQTSLTPREI.EIRRREDEYRFTKIiLQIAGISPHGNAIiGASMQ 
QQVNQQIPQEKRGGBVLDSSHDDIKI*EKSWII»LLGKlX3SGKrLL 
AQTLAKCLDVPFAICDCTTLTQAGYVGEDIESVIAKLLQDANYN 
VEKAQQGIVFtiDEVDKIGSVPGIHQLRDVGGEGVQQGIiLKLLEG 
T I WV PEKNS R KLRG E T VQ VDTTN I !■ FVAS G AFNGLD R I ISRRK 
NEKYLGFGTP SNLGKGRRAAAAADLANRSGESNTHQDI EEKDRL 
LRHVEARDL1 EFGMI PBFVGRLPVWPLHSLDEKTLVQILTEPR 
NAVIPQYQALFSMDKCELim^DAliKAXARlALERKTGARGLRS 
IMEFCLIiLBPMFEVPNSDlVCVEVDKEVVEGKKEPGYIRAPTKES 
SEEEYDSGVEEEGWPRQADAANS 


5461


1481 


160 


inpppppkspcgrarkwrrrrrpgapeaavmelpsgpgperCfB

shrlpgdcflllvlllyapvgfcllvlrlflgihvflvscalpd 

svlrrfwrtmcavlglvarqedsglrdhsvrvli5nhvtpfdh 

NIVNLLTTCSTPLLNSPPSFVCWSRGPMEMNGRGSLVESLKRFC 
ASTRLPPTPXiLIjFPEEEATNGREGLLRFSSWPFSIQDWQPIjTL 

qvqrplvsvtvsdaswvs ejblwsl fvp ftvyqvr w lrpvh rqlg 
eaneefalrvqqlvakelgqtgtrltpadkaehmkrqrhprlrp 
qsaqssfppspqpspdvqlatlaqrvkevlphvplgviqrdiak 
tgcvdltitnllegavafmpeditkgtqslptasasbofpssgpv 
tpqptaltfaksswarqest»qerkoalyeyarrrfrerraqead 


5462 


663 


3353 


KIKERQMSANKS PPSAQKS VLPTAI PAVLPAASPCSSPKTGLSA
RLSNGS FSAPS LTNSRGS VHTVS FLLQ I GLTRE S VT I EAQE LS L 
S AVKDliVCS XVYQKFPE CGFFGM YDKIIiLFKHDMNSENl LOlt I X 
SADEIHEGDLVEVVLSAIATVEDFQIRPHTLYVHSYKAPTFCDY 
CGEMLViGL.VRQQLKCEGCGhNYHKRCAFKIPmiCSGVRKRRLStJ 
VSLPGPGDS VPRPLQPEYVALPSEESHVHQEPS KRI PSWSGRPI 
WMEKMVMCR VKVPHTFAVHS YTRPTI CQ YCKRLI»KGLFRQGMQC 
KIX^FNCHKRCASKVPRDCLGEVTFNGEPSSLGTDTDIPMDIDN 
KDINSDSSRGLDDTEEPSPPEDKMFFLDPSDLDVERDEEAVKTI 
SPSTSNWI PLMRWQS1KHTKRKSSTMVKEGWMVHYTSRDNLRK 
RHYWRLDSKCLTLFQNESGSKYYKEI PLSEILRI SS PRDFXNI S 
QGSNPHCFE I ITDTMV YF VGENNGDSSHNP VLAATG VGLD VAQS 
WEKAIRQALMPVTPQASVCTS PGQGKDH KDLSTS IS VSMCQ IQE 
NVDISTVYQIFADEVLGSGQFGIVYGGKHRKTGRDVAIKVIDKM 
RFPTKQESQLRNEVAILQNLHHPGIVNLECMFETPERVFWMEK 
LHGDMLEMILSSEKSRI1PERITKFMVTQILVAI1RNLHFKNIVHC 
DLKPENVLLASAEPFPQVKIiCDFGFARl IGEKSFRRSWGTPAY 
LAPEVLRSKGYNRSLDMWSVGVItYVSliSGTFPFNEDEDINDQI 
QMAAFMYPPNPWREISGEAIDX/IWNLLQVKMRKRYSVDKSLSHP 
WLQDYQTWLDLREPETRIGERyiTHESDDARWElHAYTHNLVYP 
KHFIMAPKTPDDMEBDP 


5463 


237 


1012 


LLSVTMTTSRCSHLPEVIiPDCTSSAAPWKTVEDCGSLVNGQPQ 
Y VMQ VS AKDG Q LLS TWRTIiATQS P FNDR PMCR I CH EGS S QEDL 
LSPCECTGTLGTIHRSCLEHWLSSSNTSYCELCWFRFAVERKPR 
P L VEWLRNPGPQHE KRTLFGDM VCFLF I TPLATISG WIiCLRGAV 
DHLHFS S RLE AVGL I ALTVAL FT I YLFWTLVS FRYHCRL YNE WR 
RTNQRVILIil PKS VNVPSNQP S LLGLHS VKRNS KET W 


5464 


195 


S77 


SPSMNPRKKVDLKIjIIVGAIGVGKTSLLHQYVHKTFYEEYQTTL 
GASILSKI 1 1 LGDTTIiKLC I WDTGGQERVRSM VSTF Y KG S DGC 2 
LAFDVTDLESFEAXjDIWRGDVLAKIVPMEQSYPMVLLGNKIDLA 
DRK VQS I LBNHLiTES I KDSPDQSRSRCC 


5465 


5278 


3348 


KGD P RE F I RVH REALECDYVS AHLHEW I D L I FG YKQQG PAAVEA 
VNVFHHLFYEGQVDIYNINDPLKETATIGFINNFGQIPKQLFKK 
PHPPKRVRSRLNGDNAGI SVLPGSTSDKI FFHHUDNLR PSLT P V 
KELKEPVGQIVCTDKGXliAVEQHKVLlPPTWNKTFAWGyADIiSC 
RLGT YES D KAMTVYE CLS EWGQ ILCAI CPNPKLV I TGGTSTWC 
WEMGTSKEKAKTVTLKQALLGHTDTVTCATASIiAYHIIVSGSR 
DRTCI I WDLNKLSFL TQLRGHRAP VS ALCINELTGD I VS CAGT Y 
IHVWS I NGN P I VS VNTFTGRS QQ 1 1 CCCMSEMNEWDTQNVIVTG 
HSDGWRFWRMEFLQVPETPAPEPAEVLEMOEDCPEAQIGQEAQ 
DEDSSDSEADBQSISQDPKDTPSQPSSTSHRPRAASCRATAAKC 
TDSGSDDSRRWSDQLSLDEKDGFIFVNYSEGQTRAHLQGPLSHP 
H PNP I E VRNY3 RLKPG YRWERQliVFRS KLTMHTAFDRKDNAHP A 

GGDSCSGCSVRFSLTERRHHCRNCGQLFCQKCSRFQS E I KRIOCI 
SS PVRVCQNCYYNLQHERGSEDGPRNC 


5466 


3 


992 


HACAHASAHASGRLVRWt^KRRSVMGIQTSPVLIiASLGVGLVTIj 
LGLAVGSYLVRRSRRPQVTIiLDPNEKYLLRLLDKTTVSHNTKRF 
RFALPTAHHTLGLPVGKHIYIiSTRIDGSLVlRPYTPVTSDEDQG 
YVDLVIKVYLKGVHPKFPEGGKMSQYLDSLKVGDVVEFRGPSGIj 
LTYTGKGHFNIQPNKKSPPEPRVAKKLGMIAGGTGITPMLQIiIR 
AI LKVPED PTQCFLL FANQTEKD 1 1 LREDLEE LQARY PNRF BCLW 
FTLDHPPKDWA YS KG FVTADM I R EHL P APGDD VL VL*LCG P PPMV 
QLACH PNLDKLG YSQKMRFTY 


5467 


2103 


4 


GEALRVGTRGCRRDLPD PQARI F I QKKDLEEDESVTAAH E*KS RG 
RSPRKIDQFCNSSNMVHGSVTFRDVAIDFSQBEWECLQPDQRTIj 
YRD VMLEN YSHL I SLAGS S I SKPDVITLLEQEKEP WMVVR KETS 
RRYPDLELKYGPEKVS PEWDTSEVNLPKQVIKQ I STTLGIEAFY 
FRNDSE YRQFEGLQGYQEGNINQKM I S YEKLPTHTPHASL I CNT 
HKPYECKECGKYFSCGSNLIQHQSIHTGEKPYKCKECGKAFQLH 
XQLTRHQKFHTGEKTFECKECGKAFNLPTQLNRHKNIHTVKKLF 
ECKECGXS FNRSSNLTQHQS IHAGVKP YQCKBCGKAFNRGSNLI 
QHQKIHSWEKPFVCKEOTMAFRYHYQLIEHCQIHTGEKPFECKE 
CXjKAFTLEjTKIiVRHQKIHTGEKPFECRECGKAFSIjIjNQLNRHKN 
IHTGEKPFECKECGKSFNRSSNLVQHQS IHAGI KP YECKECGKG 
FNRGAHLIQHQKIHSNEKPFVCRECEMAFRYHCQLIEPISRIHT^ 
DKPFECQDCX3KAFNRGS SLVQHQS IHTGEKPYECKECGKAFRLY 
LQLSQHQKTHTGEKPFECKECGKFFRRGSNLNQHRSIHTGKKPF 
ECKECGKAFRLHMHLIRHQKLHTGEKPFECKECGKAFRLHMQLI 
RHQKLHTGE KP FECKECG KVFSLPTQLNRHKN I HTGEKAS 


5468 


225 


2976 


S FLTDLFQ S LAQLENLCKQL YETTDTTTRLQ AEKALVEFTNS PD 
CLSKCQLLLERGSSSYSQLIiAATCLTKLVSRTNNPLPLEQRIDI 
RNYVLK YLATRPKLATFVTOAL IQL YAR I r J*KLGWFDCQKDDYVF 
RNATTDVTRFLQDSVEYCIIGVTILSQLTNEINQVSATAFLIEA 
DTTHPLTKHRKI AS S FRDSSLFDIFTLSCNLLKQASGKNLNLND 
ES QHGLLMQLIiKLTHNCLNFDF IGTS TDESSDDLCTVQ IPTS WR 
SAFLDSSTIiQLSTIGRCEYEKTCALLVQLFDQSAQSYQELLQSA 
SASPMDIAVQEGRLTWLVYIIGAVIGGRVSFASTDEQDAMDGEI* 
VCRVLQLMNLTDSRIiAQAGNEKLErtAMLSFFEQFRKlYIGDQVQ 
KS SKL YRRLS EVIiG LNDETMVLS VFI G KI I TNLKY WGRCEP ITS 
KTLQLLNDLS IG YSS VRKLVKLSAVQFMLNNHTSEHFS FLGINN 
QStn^TDMRCRTTFYTALGRLLMVDLGEDEDQYEQFMLPIiTAAFE 
AVAQMFSTNS FNEQEAKRTLVG LVRDLRG IAFAFKAKTS FMMLF 
E WI YPS YMP I LQRAIEIjWYKDPACTT PVLKLMAELVHNRSQRLQ 
FDVSS PNG I LLFRETS KM I TM YGNRI LTLGB VPKDQ V YALKL KG 
IS ICFSMLKAALSGS YVNFGVFRLYGDDALDNALQTFI KLLLS I 
PHSDLLD Y P KLSQS Y YS LLEVLTQDHMNF XASLEPHV I MY I IjS S 
ISEGIjTALDTM VCTGCCS CLDH I VT YIjFKQLSRS TKKRTTP LNQ 
ESDRFLHIMQQHPEMIQQMLSTVLNIIIFEDCRNQWSMSRPLLG 
LI LLNEKYFS DLRNS I VNS QP PEKQQ AMHL CFEN LMEG I ERNLL 
TKNRDRFTQNLSAFRREVNDSMKNSTYGVWSNDMMS 


5469 


134 


2653 


DQEFErSLVPWHLPMGWLCSGLLFPVSCLVLLQVASSGNMKVLQ 
E PTC VSD YMS I S TCBW KMNGPTNCS TELRL LYQLVFLLS EAHTC 
VPENNGGAGCVCHLtiMDDWSADNYTLDLWAGQQIiLWKG S FKPS 
EH VKPRA PGNLTVHTNVS DTLLLTJf SNP YPPDN YLYNHLT YAVN 
IWSENDPADFRI YNVTYLEPSLRI AASTLKSGIS YRARVR AWAQ 
CYNTTWSEWSPSTKWHNSYREPFEQHLLLGVSVSCIVILAVCLL 
CYVS ITKIKKEWWDQI PNPARSRLVAI I IQD AQGSQWE KRS RGQ 
E PAKC PHWKNC LTKLLP CFLEHNMKRDEDPHXAAKEMPFQGSGK 
SAWCPVE ISKTVLWPES ISWRCVELPEAPVECESEEEVEEE KG 
SFCASPESSRDDFQEGREGIVARLTESLFLDLLGEENGGFCQQD 
MGES CLLPPS GSTSAHMPWDEFPS AGP REAP PWGKEQPLHLE PS 
PPASPTQSPDWIiTCTBTPLVIAGNPAYRSFSNSLSQSPCPRELG 
PDPLLARHLEEVEPEMPCVPQLSEPTTVFQPEPETWEQILRRSV 
LQHGAAAAP VSA PTSGYQE FVHAVEQGGTQAS AVVGLGPPGEAG 
YKAFS SliLAS S AVSPEKCGFGASSGEEGYKPFQD^I PGCPGDPA 
PVPVP LFTFGLDREPPRS PQ5SHLPSSSPEHLGLEPGE KVEDMP 
KPPLPQEC2ATDPLVDSLCSGIVYSALTCHLCGXTIjKQCHGQEDGG 
QTPVMASPCCGCCCGDRASPPTTPLRAPDPSPGGVPIiEASLCPA 
SLAPSGISEKSKSSSSFHPAPGNAQSSSQTPKIVNFVSVGPTYM 
RVS 


5470 


17 


1418 


TACRI R TSLNRG I AAVK KDA VEM LAS YGI±A YSLMKFFTGPMS DF 
KNVGLVFVNSKRDRTKAVLCMWAGAIAAVFHTLIAYSDLGYYI 
INKLHHVDESVGSKTRRAFLYLAAFPFMDAMAWTHAGILLKHKY 
S FL VG CAS I S D VI AQ WFVAILLHSHLE CREPLLI P 1 LSLYMG A 
LVRCTTLCLGYYKNIHDI I PDRSGPBLGGDATI RKMLSF WWPLA 
LI LATQRI SRP I VMLF VSRDLGG SS AATEAVAI LTAT YPVGHM P 
YGWLTE IRAVYPAFDKNNPSNKLVSTSNTVTAAH I KKFTFVCMA 
LSLTLCFVMFWTPNVSEKI L IE>I IGVDFAFAELCWPLK I FS FF 
P VP VTVRAHLTGW LMTLKKT FVLAPS SVLRI IVLI ASLWLPYL 
GVHGATLGVGSIiLAGFVGESTMDAIJ^ACYVYRKQKKKMENESAT 
EGEDS AMTDMPPTEE VTDI VEMREENE 


5471 


1868 


658 


RSSAPPGPQRAAAATAAAAAAGVEMAAAAAQGGGGGEPRRTEGV 
GPG VPGEVEMVKGQ PFDVG PRYTQLQ Y I GEGAYGMVS SAYDHVR 
KTRVAI KKI SPFEHQTYCQ RTLRE IQ I LLRFRHENV I G XRDI LR 
AS TLEAMRDVY I VQDLMETDL YKLLKSQQLSNDH I C YFL YQI LR 
GLKY 1 HSANVLHRDL.KPSNLLINTTCDLKICDFGIARI ADPEHD 
HTGFLTEYVATRW YRAPE I MLNS KGYTKS IDIWS VGCILAEMLS 
NR P I FPG KH YLDQLNHI LG ILGS PSQEDLNCI INMKARNYLQS L 
PS KTKVAWAKLFP KS DS KALDLLDRMLTFNPNKR I TVEEALAH P 
YLEQYYDPTDEPVAEEPFTFAMSLDDLPKERLKELIFQETARFQ 
PGVLEAP 


5472 


1463 


753 


LYVMARYLSDEEVAVS IDRLCKANGRS PS I P FGTVRI PGRARVR 
DPQALWI FG YGSLVWRPDFAYSDSRVG FVRGYSRRFWQGDTFHR 

VLGGYDTKEVTFYPQDAPDQPLKALAYVArPQNPGYLGPAPEEA 
IATQILACRGPSGHNLEYLLRVRDVMQLCGPQAQDEHLAAIVnA 
VGTMLPCFCPTEQALALV 


5473 


3 


2119 


FMNVKLL I QDLED I EQR VPVMDAQ YKI I T KTAHLI T KES PQEEG 
KEMFATMSKLKEQLTKVKECYS PLL YE5QQLL I PLBELEKQMTS 
PYDSLGKINEIITVLEREAQSSALFKQKHQELLACQENCKKTLT 
L I EKGSQS VQKFVTL SNVLKH FD 0/TRLQRQ IAD IHVAFQSMVKK 
TGDWKKHVEIWSRLMKKFEESRAELEKVLRIAQEGLEEKGDPEE 
LLRRHTEFFSQLDQRVLNAFLKACDELTDILPEQSQQGLQEAVR 
KLHKQWKDLQGEAPYHLLHLKIDVEKNRFLASAEECRTELDRET 
KLMPQEGSEKrrKEHRVPFSDKGPHHLCBKRIiQLIEBLCVKLPV 
RDPVRDTPGTCHVTLXELRAAIDSTYRKLMEDPDKWKDYTSRFS 
EFSSWISTNETQLKGIKGEAIDTANHGBVKRAVEEIRNGVTKRG 
ETtiSWLKSRLKVLTE^SSENEAQKQGDEIJUajSSSFKAIiVTliLS 
EVEKMLSNFGDCVQYKEIVKNSLEELISGSKEVQEQAEKILDTE 
NLFE AQQLLLHHQQKTKR I SAKKRD VQQQ I AQAQQG EGGL ? DRG 
HBELRKLESTLDGLERSRERQERR IQVTLRKWER FETNKETWR 
YLFQTGSSHERFLS FSSLESLSSELEQTKE FSKRTES I AVQAEN 
LVKEASEI PLGPQNKQLLQQQAKS I KEQ VKKLEDTLEEE YV I DK 

s 


5474 


2 


780 


TPDVRQLQASRRGIAVASWCSPRWFAGEEMAFVKSGWLLRQSTI 
LKRWKKNWFDLWSDGHL1YYDDQTRQNIEDKVHMPMDCINIRTG 
QECRDTQPPDGKSKDCMLQIVCRDGKTrSLCflESTDDCrAWKPT 
LQDS RTNTAY VG S AVMTDETS WS S PPPYTAYAAPAPE VGRTLS 
LQQAYGYGPYGGAYPPGTQWYAANGQAYAVPYQYPYAGLYGQQ 
PANQVI TRERYRDNDSDLALGMIAGAATGMALGSLFWVF 


5475 


2 


506 


ARG WLESLS IxTCQTTPPPS S PCLIiHS PETF I HTMP PNLTG YYRF 
VSQKI^EDYLQAIiNISlAVRKIAIiLLKPDKEIEHQGNHMTVRTL 
STFRNYTVQFDVGVEFEEDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEVPNRGWRHWLEGEMLYLEI/TARDAVCEQVFRKVR 


5476 


132 


1457 


SDSMS LI*DCFCTSRTQVE S LRPEKQSET S IHQ YL VDE PTL S WS R 
PSTRASEVLCSTNVSHYELQVEIGRGFDNLTSVHLARHTPTGTL 
VTIKlTNIiENCNEERLKAIiQKAVILSHFFRHPNITTYWTVFTVG 
S WLWVI S P FMA YGS ASQLLRT YFPEGMS ETL IRNI ZiFGAVRGLN 
YLHQNGCIHRS I KASHI L I SGDGISVThS GLS HLH S Ii VXHGQ RHR 
AVYDFPQFSTSVQPWLSPELLRQDLHGYNVKSDIYSVGXTACEL 
ASGQVPFQDMKRTQMLLQKLKGPPYSPLDISIFPQSESRMKNSQ 

Qrj\l , nQfZTf3T7Q\/r.\/Car ,f T T U'T r t7MOr>DT TJT*Ti O PVPUn Ti Tv r»noY Trm r+ 

oiTVUobJ.«iloVbvaa»jiin 1 vrJolJKAjxil SFAc bS»I»VQL»C 

LQQD PE KR PSAS S LLSHVF FKQMKEESQDS ILSLLP PAYNKPS 1 
SLPPVLPWTEPECDFPDEKDSYWEF 


5477 


3 


1044 


RGNSRLRYSHEDELQLPRLPELFETGRQIibDEVEVATEPAGSRI 
VQEKVFKGLDLIiEKAAEMIjS QLDh FSRNEDLEE IAS TDLKYLL V 
P AFQGALTMKQVN P S KRLDHLQRAREH F INYLTQ CHC YHVAE F 3 
I^KTMNKSAEl^TANSSMAYPSLVAMASQROAKIORYKQKKELE 
HRLSAMKSAVESGOAQDERVRF Wr.TtWT.nPWTnT QT.PTTTt?c 
E I KILRERDSSREAS TSNS SRQERPPVKPF ILTRNMAQAKVFGA 
GYPSLPTMTVSDWYEQHRKYGALPDQGIAKAAPEEFRKAAQQQE 
EQEEKEEEDDEQTLMRAREWDDWKDTHPRGYGNRQNMG 


5478 


2 


835 


KTVRI WVPNVKGESTVFRAHTATVRSVHFCSDGQS FVTASDDKT 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK 
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW 
D VRTHRLL QH YQ L H£ AAVNG LS FHPSGNYLI TASSDSTLKILDL 
M EGRLL YTLHGHQGPATTVAFS RTGEY FAS GGSDEQVMVWKSNF 
D IGDHGEVTKVPRPPATLASSMGNLTVS ILEQRLTIiEEDKLKQC 
LBNQQIjIMQRATP 


5479 


2 


835 


ktvriwvpnvkgestvfrahtatvrsvkfcsdgqsfvtasddkt 

VKVWATHRQ KFIiFSLSQHINWVR CAKFS PDGRIiIVSASDDKTVK 
LWDKSSRECVHSY CEHGGFVT YVDFHPS GTC I AAAGMDNT VKW7 
DVRTHRLLQHYQLHSAAVNGLS fhpsgnyli tassdstucildl 
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF 
DI GDHGEVTKVPRP PATIiASS MGNLTVS I LEQRLTLEEDKLKQC 
LENQQLIMQRATP 


5480 


444 


1952 


LSIjTSRMEEAEbVKGRtiQAITDKRKIQEEISQKRLKIEEDKLKH 
QHLKKKALREKWLLDG ISSGKEQEEMKKQNQQDQHQ I QVLEQS I 
LRIiEKEIQDLEKAELQISTKEEAIIJCKLKS IERTTEDI IRSVKV 
EREERAEES IEDIYANI PDLPKSYIPSRLRKEINEEKEDDEQNR 
KALYAMEIKVEKDLKTGESTVLSS IPLPSDDFKGTGI KVYDDGQ 
KSVYAVSSNHSAAYNGTDGIiAP VEVEELLRQASERNSKS PTEYH 
EPVYANP PYRPTTPQRET VTPGPNFQERI KI KTITGLG I G VNES I 
HNKGNGLSEERGNNFNHI SPI PPVPHPRS VIQQAEEKLHTPQKR 
I>MTPWEESNVMQDKDAPSPKPRLS PRETI FGKSEHQNSSPTCQE 
DEEDVRYNrVHSLPPDINDTEPVTMIFMGYQQAEDSEEDKKFLT 
GYDGIIHAELWIDDEEEEDEGEAEKPSYHPIAPHSQVYQPAKP 
TPLPRKRSEAS PHEKHKS 


5481 


3 


1422 


NS PGSVCLCQCVCPSLLHCLPPLLLLLLL PLLLHES PQP PALRV 
VATSSDRNFMNKHQKPVLTGQRFKTRKRDEKEKFEPTVFRDTLV 
QGLNEAGDDLE AVAKFLDSTGSR LD YRR YADTLFD I LVAGSMLA 
PGGTRIDDGDECTKMTNHCVFSANEDHETIRNYAQVFNKLIRRYK 
YLEKAFEDEMKKLLLFLKAFSETEQTKIAMIiSGILLGNGTLPAT 
ILTSLFTDSLVKEGIAASFAVKLFKAWMAEKDANSVTSSliRKAN 
LDXRLLELFPVNRQSVDHFAKYFTDAGLKEIiSDFLRVQQSLGTR 
KELQKELQERtiSQECP I KEWLYVKEEMKRNDLPETAVIGLUWT 
CIMNAVEWNKKEELVAEQALKHLKQYAPLLAVFSSQGQSELILL 
QKVQE YCYDKTI HFMKAFQKI WliFYKADVLS EEAI LKWYKEAH V 
AKGKSVFLDQMKKFVEWLQNAEEESESEGEEN 


5482 

1492 


528 


THWMTGMCYAPHQVLSYINGVTTSKPGVSLVYSKPSRNLSLRL 
EGLQEKDS GP YS C S VNVQDKQGKS RGHS I KTLELNVLVPPAP PS 
CRLQGVPHVG ANVTLS CQSPRS KPAVQYQWDRQLP S FQTFFAPA 
LBVIRGS LShTNhS S S MAGVYVCKAHNEVGTAQCNVTIjEVSTG P 
GAAWAGAVVGTIjVGLGLIjAGLVIiIjYHRRGKALEEPANDIKEDA 
IAPRTLPWPKSSDTISKWGTLSSVTSARALRPPHGPPRPGALTP 
TPSLSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVPAQSQAGSLV 


5483 


1 


788 


FFFFKGCRAGRGNESDYRKIjEEMHQRFLVSERSKDDLQLRLXRA 
ENRIKQLBTDSSEEISRYQENIQKLQNVLESERENCGLVSEQRIj 
KLQQENKQIiRKETESLRKIALEAQKKAKVKISTMEHEFSlKERG 
FEVQLREMEDSNRNSIVEIjRHLLiATQQKAANRWKEETKKLTESA 
RlRTNNLKSELSRQKLHTQELLSQIiEMANEKVAENEKliILEHQE 
ECANRLQRRLSQAEERAASASQQLSVITVQRRKAASLMNLENI 


5484 


3 


1997 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS

ESDQDERGDSGQPSNKELFGDDSEDEGASHHSGSDNHSERSDNR 

SEASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 

AEGSEKAHSDDEKWGREDKSDQSDDEKrQNSDDEBRAQGSDEDK 

LQNSDDDEKMQNTDDEERPQLSDDERQQLSEEEKANSDDERPVA 

SDNDDEKQNS DDEEQ PQLSDE EKMQNSDDERPQAS DEEHRHS DD 

EEBQDHKSESARGSDSEDEVLRMKRKNAXASDSEADSDTEVPKD 

NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGLPQDQQEEE 

PI PETRIEVE I PKVNTDLGNDLYFVKLPNFLS VEPRPFDPQY YE 

DEFEDEEMLDEEGRTRLKLKVENTIRWR1RRDEEGNEIKESNAR 

I VKWS DGSMS LHLGNE VFDVYKAPLQGDHNHLF I RQGTGLQGQ A 

VFKTKLTFRPHS TDSATHRKMTI1SI1ADRCS KTQKI R I kPMAGRD 

PECQRTEMIKKEEERLRAS I RRE SQQRRMR EKQHQRGLSAS YI*E 

PDRYDEEEEGEESISIiAAIKNRYKGGIREERARlYSSDSDEGSE 

EDKAQRLLKAKKLTSDEVRPKLFWSRGLSCTQEPTALNEELTDQ 

AGTN 


5485 


161 


1074 


KRK I LS S MMDS EAHEKRP P ILTS S KQD I S PH ITNVGEM KH YLCG
CCAAFNNVAIT FP I QKVLFRQQL YG X KTRDAILQLRRDGFRNL Y 
RGILPPLMQKTTTIALMFGIjYEDLSCLLHKHVSAPEFATSGVAA 
VLAGTTEAI FTPLERVQTLLQDHKHHDKFTNTYQAFKALKCHGI 

geyyrglvpilfrnglsnvlffglrgpikehlptatthsahlvn 

DFICGGLLGAMLGFLFFP INWKTRI QSQ IGGEPQS FPKVFQKI 
WLERDRKLINLFRGAHIiNYHRSL ISWGI I NAT YE FL LKVI 


5486 


1404 


142 


I PGSTI S WSPAAARGIjSVCRC CRLHPASAMDLFGDLPEP ERS PR
PAAGKEAQKGPLLFDDLPPASSTDSGSGGPtiLFDDLPPASSGDS 
GS LATS I SQMVKTEGKGAKRKTSEEEKNGS EEL VEKKVGKAS S V 
I FGLKGYVAERKGEREEMQDAHV r LNDI TEECRP PSS LITRVS Y 
FAVFDGHGGIRASKFAAQNLHQNLJRKFPKGDVISVEKTVKRCIj 
LDTFKHTDEEFLKQAS SQKPAWKDGSTATCVIiA VDN I Ii YIANLG 
DSRAILCRYKEBSQKHAALSLSKEHNPTQYEERMRIQKAGGNVR 

dgrvlgvlevsrsigdgqykrcgvtsvpdirrcqltpndrfill 
ACDGLFKVFTPEEAVWFILSCLEDEKIQTREGKSAADARYEAAC 
NRLANKAVQRGSADNVTVMVVR I GH 


5487 


535 


182 


AVSLEQIRGLQTPAPVPLPLQPCPSNCDMERVTLALLLIiAGLTA 
LEANDPFANKDDPPYYDWKNLQLSGLICGGLLAIAGIAAVIiSGK 
CKCKSSQKQHSPVPEKAIPLITPGSATTC 


5488 


1072 


259 


AMAASGE PQRQWQEE VAAVVWGS CMTDLVSLTS RL P KTGETIH 
GHKFFIGFGGKGANQCV(2AARLGAMTSMVCKVGKDSFGNDYIEN 
LKQNDISTE FTYQTKDAATGTAS I IVNNEGQNI IVIVAGANLLL 

FNPAPAIADLDPQFYTLSDVFCCWESEAE J LTGLTVGSAADAGE 
AALVLLKRGCQW1 ITLGASGCWIjSQTEP EPKH IPTEKVKAVD 
TTVSFKT 


5489 


81 


893 


GKG P VAAF 1 DQSNI FLTDFXI FI*GQWREE PKMPLLLLGE TEPLK 
LERDCRS PVE PWAAAS PDLALACLCKCQDLSSGAFPNRGVLGGV 
IiFPTVEMVI KVFVATSSGS IAIRKKQQE WGFX.EANKI DFKELD 
IAGDEDNRRWMRENVPGEKKPQNGI PLPPQI FNEEQYCGDFDSF 
FSAKEENI I YSFLGLAPPPDSKGSE KA3EGGETEAQKEGS EDVG 
NliPEAQEKNEEEGETATEETSEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5490 


81 


893 


GKG P VAAF I DQSN I FLTD P KI FLGQWR E E P KM P L LLLGETEP LK 
LERDCRS PVEPWAAASPDLAXACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVI KVFVATSSGS IAIRKKQQEWGFLEANKIDFKELD 
IAGDEDNRRWMRENVPGBKXPQNGIPLPPQIFNEEQYCGDFDSF 
FSAKEEJTIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5491 


204 


1194 


GSAPRLSLGPTGAQARDPDWWARPPSRPYTQSKEDRPDTEGRSE 
QGDMASS FLPAGAI TGDSGGE LSSGDDS GE VEF P HSP E IEETS C 
IAELFEKAAAHLQGLIQVASREQLLYLYARYKQVKVGNCNTPKP 
SFFDFEGKQKWBAWKALGDSSPSQAMQEYIAVVKKLDPGWNPQI 
P EKKGKE ANTGFGG PVI S S LYHEETIREEDKNI FDYCRENN I DH 
ITKAI KS raWDVNVKDEEGRALLHWACDRGHKELVTVLIjQHRAD 
INCCDNEGQTALHYASACEFLDIVELLLQSGADPTLRDQDGCLP 
EEVTGCKTVSLVLQRHTTGKA 


5492 


3 


1896 


AS KN PIjS AVCTTG I MS S LAVRDPAMDRS LRS VF VGUf I P YEATEE 
QLKDI FSEVGSWS FRXjVYDRETGKPKG YGFCE YQDQETALSAM 
RNLNGREFSGRAIiRVDNAASEKNKEELKS LGPAAP I IDSPYGDP 
IDPEDAPES I TRAVASLPPEQMFELMKQMKLCVQNSHQEARNMIi 
rviin.1 ALiljy/iy V vyiKXFiUif S*i XrtJjlV±JjnJirv.JLnva rjjx erui\.ow 
SVSVSGPGPGPGPGLCPGPWLIiNQQNPPAPQPQELARRPVKDI 
PPLMQTP IQGGI PAPGP I PAAVPGAGPGSI/TFGGAMQPQLGMPG 
VGPVPLERGQVQMSDPRAPIPRGPVTPGGLPPRGLLGDAPNDPR 
GGTLLSVTGEVEPRGYLGPPHQGPPMHHASGHDTRGPSSHEMRG 
GPLGDPRLLIGEPRGPMIDCRGLPMDGRGGRDSRAMETRAMETE 
VLETRVMERRGMETCAMETRGMEARGMDARGLEMRGPVPSSRGP 
MTGGIQGPGPINIGAGGP PQGPRQVPGI SGVGNPGAGMQGTG I Q 
GTGMQGAG I QGGGMQGAG I QGVS I QGGG I QGGG I QGAS KQGGSQ 
PSSFSPGQSQVTPQDQEKAALIMQVLQIiTADQXAMLPPEQRQSI 
LILKEQIQKSTGAS 


5493 

1 


1876 


TDTFRVKRPHLRRSASNGHVPGTPVYREKEDMirDEIIELKKSLH 
VQKSDVDI^RTXI^RRLEEE^SRKDRQIEQLLDPSRGTDFVRTLA 
EKRPDASWINGLKQRIIiKLEQQCKEKDGTISKLQTDMKTTNLE 
EMRIAiMETYYEEVHRLQTLLASSETTGKKPLGEKKTGAKRQKXM 
GSALLSLSRSVQELTEENQSLKEDLDRVLSTSPTISKTQGYVEW 
SKPRLLRRI VELEKKLS VMESSKSHAAEPVRSHP PACIASS S AL 
HRQPRGDRNKDHERLRGAVRDLKEERTALQEQLLQRDLEVKQLL 
QAKADLEKELECAREGEEERREREEVI»REEIQTLTSKI*QELQEM 
KKEEKEDCPEVPHKAQELPAPTPSSRHCEQDWPPDSSEEGLPRP 
RSPCSDGRRDAAARVLQAQWKVYICHKXKKAVLDEAAVVLQAAFR 
GHLTRTKLIAS KAHGSEPPS VPGLPDQS SPVPRVPSP I AQATGS
PVQEEAI VI IQSALRAHLARARHSATSKRTTTAASTRRRSASAT 
HGDASSP PFIAALPD PSPSGPQAVAPLPGDDVNSDDSDD I V1AP 
SLPTKNFPV 


5494 


71 


536 


RSKAKIGTPTREVPSTDMXVRRESSSSLTHRPAPSPATPRLLGT
RRVLLGVS EGTGCADAMELVLVFLCS LLAFMVIiASAAE KE KEKD 
PFHYDYQTLRIGGLVFAWLFSVGILLILSRRCKCSFNQKPRAP 
GDE E AQVENL I TAN ATE PQKAEN 


5495 


273 


2168 


DS LLL I QVDTM PFTLHIiRSRL PS Al R SL I LQ KKPN I RN TS S MAG 
ELRPASLWLPRSliAPAFERFCQVNTGPliPLLGQSEPEKWMLPP 
QGAI S ErRMGHPQFWK YEFGACTGSIiASLEQ YS EQIiKDM VAFFL 
GCS FS LEEALEKAGLPRRDPAGHS QAGAYKTT VPCVTHAGFCCP 
LWTMRPlPKDKIiEGLVRACCSIiGGEQCSQPVHMGDPELIiGI KEL 
SKPAYGDAMVCPPGEVPVFWPSPLTSLGAVSSCETPLAFAS I PG 
CTVMTDLKDAKAPPGCLTPERIPEVHHISQDPLHYSIASVSASQ 
KIRELESMIGIDPGNRGIGHLLCKDELLKASLSLSHARSVLITT 
GFPTHFNHEPPEETDGPPGAVAI*VAFLQAI*EKEVAXIVDQRAWN 
LKQKI VEDAVEOGVLKTQ I P ILTYQGGSVEAAQAFLCKNGD PQT 
PRFDHLVAIERAGRAADGNYYNARKMNIKHX.VDP IDDLFLAAKK 
IPG I SSTGVGDGGNEIiGMGKVKEAVRRHIRHGDVIACDVEADFA 
VIAGVSNWGG YAIiACAL YII*YS CAVHSQYLRKAVGPS RAPGDQA 
WTQALPS VIKBEKMLGI LVQHKVRSGVSG I VGMEVDGLPFHNTH 
AEM I QKL VDVTTAQV 


5496 


3 


2408 


QDTKMHE I YKGNI TPQLNKNTLKTSAATDVWAVYFSQFWI DY3G 
MKSGKGRP ISF VDS FPLS I W ICQPTRYAESQKEPQTCNQVS LNT 
SQSBSSDLAGRIiKRKiaLKEYYSTESEPLTWGGQKPSSSDTFFR 
FSPSSSEADIHIiLVHVHKHVSMQINHYQYLLLLFLHESLILLSB 
NLRKDVEAVTGSPASQTS I CIGILLRSABLALLLHP VDQANTLK 
SPVSESVSPWPDYLPTBNGDFLSSKRXQISRDINRIRSVTVNH 
MSDKRSMSVDLS HI PLKDPLLFKSASDTNLQKGI S FMDYLSDKH 
LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYR3 
DSNI hS FDSDGNQWI LS S TL TS JGGNE TI ES XFKAEDIiIjPE AASIi 
SENLDISKEETPPVRTLKSQSSLSGKPKERCPPNIAPLCVSYXN 
MKRSSSQMSIiDTISLDSMILEEQLLESDGSDSHMFLEKGNKKNS 
TTNYRGTAES VNAGANLQtfYGETS PGAI STNSEGAQENHDDLMS 
VWFKITGVNGE ID IRGEDTEICLQVNQVTPDQLGM I SLRHYLC 
NRPVGSDQKAVIHS KSS PE ISLRFESGPGAVIHSLLAEKNGFLQ 
CHI KKFSTEFLTSSLMN IQHFLEDETVATVMPMKI QVS NTKINJL 
KDI^PRSSTVSLEPAPVTVHIDHLVVERSDDGSFHIRDSHML1!3T 
GNDLKENVKS DS VLLTSGKYDL KKQRS VTQATQTS PGVP WPS QS 
ANFPEFSFDFTREQLMEENESIiKQELAKAKMAIAEAHLEKDAIiL 
HHIKKMTVE 


5497 


1821 


3308 


S ISKLLKRRSNIDAYLLSNSCAFFAPRIiFSLASQI IREQQSPNV 
CFrYKYSGFPSLECQCHFVSPHSSCYIWFFSFPPPFFVCFQLSM 
GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 
YYTIGPGMFPSSQIPSWKDWAKPGPYDQPLVNTLQRRKEKREPD 
PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEEIiA 
liALSRGIiQUDTQRSSRDSIiQCSSGYSTQTTTPCCSEDTIPSQVS 
DYDYFSVSGDQEADQQEFDKSSTI PRNSD ISQS YRRMFQAKRPA 
SrAGLPTTLGPAMVTPGVATIRRTPSTKPSVRRGTIGAGPlPIK 
TP VI PVKTPTVPDLPGVLPAPPDGPEERGEHSPES PSVGEGPQG 
VTSMPSSMWSGQASVNPPLPGPKPSIPEEHRQAIPESEAEDQER 
EPPSATVSPGQIPESDPADLSPRDTPQGEDMIiNAlRRGVKLKKT 
TTNDRSAPRFS 


5498 


2434 


1492 


ILTHQEIFTGEKPCECGKASIQMSHLSQQKIYSGEKPFACKVCG 
KVFSHKSNLTEHEHFHTREKP FECNECGKAFSQKQYVI KHQNTH 
TGEKLFECNECGKSFSQKENLLTHQKIHTGEKPFECKDCGKAFI 
QKSNLIRHQRTHTGEKPFVCKECGKTFSGKSNLTEHEKIHIGEK 
PFKCSE03TAFGQKKYL IKHQNIHTGEKP YECNEGGKAFSQRTS 
LIVHVRIHSGDKPYECNVCGKAFSQSSSLTVHVRSHTGEKPYGC 
NECGKAFSQFSTLAIjHIiRIHTGKKPYOCSECGKAFSQKSHHIRH 
QKIHTH 


5499 


324 


926 


GFGQIGRGHKI TTYPFS PRKSGRKGMAQSQGWVKRY I KAFCKGF 
FVAVPVAVTFLDRVACVARVEGASMQPSLNPGGSQSSDVVLLNH 
WKVRWFEVHRGDI VSLVS PKNPBQKI IKRVIALEGDI VRTIGHK 
NRYVKVPRGH I WVEGDHHGHSFDSNS FGPVSLGLLHAHATHILW 
PPERWQKLESVLPPERLPVQREEE 


5500 


1978 


1285 


KPDWRLQNLFPRLYLWRSSRFGFGHLKKRLQMDFKIEHTWDGFP 
VKHEPVF I RLNPGDRG VMMD ISAPFFRDPPAPLGEPGKP FNELW 
DYEWEAFFLND I TEQYLEVELCPHGQHLVLLLS GRRNVWKQEL 
PLSFRVSRGETKWEGKAYIiPWSYFPPNVTKFNSFAIHGSKDKRS 
YEALYPVPQHBLOQGQKPDFHCLEYFKSFNFNTLr»GEEWfCQPSS 
DLWLIEKCDI 


5501 


2927 


2226 


C RP P VS ARVAPGHQGAVGGS GRRPARVE WDAAAR PSS RP FS LP 
AAIMLALISRLLDWFRSLFWKEEMELTLVGLQYSGKTTFVNVIA 
SGOFSEDMI PTVGFNMRfCVTXnNVTT V T wnTrtrtrtDOTTO qmwpis v 

CRGVNAIVYMIDAADREKlEASRNEIiHNLLDKPQLOGIPVLVLG 
NKRDLPNALDEKQLI EKMNLSAIQDRE I CC YS IS CKEKDNID I T 
LQWLIQHSKSRRS 


5502 


3 


824 


GKFFKGGGS S KSRAAPS PQ EALVRLRETEEMLG KKQE YLENR I Q 
REI AIAKKHGTQNKRAALQALKRKKRFE KQLTQ I DGTLST IEFQ 
REALEWSHTNTEVLRNMGFAAKAMKS VHENMDLNKIDDLMQE IT 
EQQDIAQE I SEAFSQRVG FGDD FDEDELMAELE ELE QEELNKKM 
TKIRLPKVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 
IKQLAAWAT 


5503 


216 


654 


KGVRRRGRVRSDSEDSHLG Y FKMS FLLP KLTSKKEVDQA1KSTA 
EKVLVLRFGRDEDPVCLQLDDILSKTSSDLSKMAAIYLVDVDQT 
AVYTQYFDISYIPSTVFFFNGQHMKVDYGGEDPALRSIKAVRRT 
SPAGTLGEKPVNS 


5504 


58 


3563 


QLSFSFQAPVTFDDITVYLLQEEWVLbSQQQKELCGSNKLVAPL 

GPTVANPEIiFRKFGRGPEPWLGSVOGQRSLLEHHPGKKQMGYMG 

EMBVQGPTRESGQSLPPQKKAYLSHIiSTGSGHIEGDWAGRNRKL 

LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREYPSXRDKRSRL 

IEG YTGP FKVETLKYHAKS KAHMFCVNALAARDPI WAARFRS IR 

DPPGDVLAS PEPLFTADCP I FY PPGPLGGFDSMAELLPSSRAEL 

EDFGGLX3AI PAM YLJDCI SDLRQKEITDQ XHSSSD IN I LYNDAVE 

SGIQDPSAEGLSEEVPVVFEELPWFEDVAVYFTREEWGMLDKR 

QKELYRDVMRMNYELLASLGPAAAKPDLISKLERRAAPWIKDPN 

GPKWGKGRPPGNKKKVAVREADTQASAADSALLPGSPVEARASC 

CSSSICEEGDGPRRIKRTYRPRSIQRSWFGQFPWLVIDPKETKL 

FCSACI ERP WLHDD KS SRL VRG YTG PFKVETLKYHRVS KAHR1 CV 

NTVEIKEDTPHTALVPE I SSDLMANMEHFFNAAYS IAYHSRPLN 

DFEKILQLLQSTGTVILGKYRNRTACTQFIKYISETLKREILED 

VRNSPCVS VLLDSSTDAS EQACVGI YIRYFKQMEVKES Y1TLAP 

LYSETADGYFETIVSALDELDIPFRKPGWWGLGTDGSAMLSCR 

GGLVEKFQEVIPQLLPVHCVAHRLHLAVVDACGSIDLVKKCDRH 

IRTVFKFYOSS^RLKELQEGAAPLEaEIIRLKDLNAVRWVASR 

RRTLHALLVS W PALARHLQR VAE AGGQ I GHRAKGMLKLMRGFH F 

VKFCHFLLDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVALES 

LRHQAGPKEEEFNASFKDGRLHGICLDKLEVAEQRFQADRBRTV 

LTG IE YLQQRFDADRP PQLKNMEVFDTMAWPSGIELAS FGNDDI 

LNLARYFBCSLPTGYSEEALLEEWLGLKTIAQHLPFSMljCKNAL 

AQHCRFPLLSKU^VVVCVPISTSCCERGFKAMNRIRTDERTKL 

SHEVLNMLMMTAVNGVAVTEYDPQPAIQHWYLTSSGRRFSHVYT 

CAQVPARSPASARLRKEEMGALYVEEPRTQKPP ILPSREAAEVL 

KDCIMEPPERLLYPHTSQEAPGMS 


5505 


3312 


1219 


NCS PRSLSAAKIflSNRNNNKLPSNLPQLQNL I KRDPPAY I EEFLQ
QYNHYKSNVEIFKLQPNKPSKELAELVMFMAQISHCYPEYLSNF 
POEVKDLLSCNHTVLDPDLRMTFCKALILLRNKNLINPSSLLEL 
FFELFRCHDKI^RKTLYTHIVTDIKNINAKHKtWiamvVLQNFM 
YTMIiRDSNATAAKMSLDVMIELYRRNIWNDAKTVNVlTTACFSK 

KKSS KNKKKLEKAMKVIjKKHRKKKKPEVFNFSAIHLIHD pqdfa 
EKLLKQLECCKERFEVKMMLMNLI SRLVG IHELFLFNFYPFLQR 
FLQPHQREVTKILLFAAQASHHLVPPE I IQSIjLMTVANWFVTDK 
NSGEV>lTVGINAIKEITARCPIiAMTEELLQDLAQYKTHKDKl?rVM 
MS ARTLI HL FRTLNPQMLQKKFRGKPTEAS I BARVQEYGELDAK 
DYIPGAEVLEVEKEENAENDEDGWESTSLSKEEDADGEWIDVQH 
SSDEEQQEISKKLNSMPMEERKAKAAAIST9RVLTQBDFQKIRM 
AQMR KEIjDAAPG KSQKRKYIE IDSDEEPRGELLSLRD IERLHKK 
PKSDKETRLATAMAGKTDRKEFVRKKTKTNPFSSSTNKEKKKQK 
NFMMMRYSQNVRSKNKRSFREKQLAIiRDALLKKKKRMK 


5506 


1 


1531 


FHGDLCGQRGGS APGEGGSS AWPAPAHPLPERERER EALCPGR S 
Lour^iVjrCiJ!. I VQ» I 1 PVWo PL»hW3GDEEr^PNPYVRFPYRWWAVVv 
LAAFPSLGAGGETPEAPPESMTQLWFFRFVVNAAGYASFMVPGY 
LLVQYFRRKNYLETGRGLCFPLVKACVFGNEPKASDEVPIiAPRT 
EAAETTPMWQALKLIjFCATGIiQVSYLTWGVLQERVMTRSYGATA 
TS PGER FTDS QFluVLMNR VL AL I VAG JjSCVLCKQ PRHGAPMYR Y 
S FAS L SNVLS S WCQYEALKF VS FPTQ VLAKAS KVT PVMLMG KLV 
SRRS YEHWEYbTATLI S IGVSMFLLSSGPE PRS SPATTLSGLIL 
LAGYlAFDSFTSNWQDAlJi'AYKMSSVOMMFGVNFFSCLFrVGSI/ 
LEQGALLEGTRFMGRHSEFAAHALLLSICSACGQLFIFYTIGQF 
GAAVFTI I MTLRQAFAI LLS CLLYGHTVTWGGLGVAWFAALL 
LRVYARGRLKQRGKKAVPVES PVQKV 


5507 


3704 


1271 


PRGTRRCRPAGRASRRARRRPPCPGPAAPGSLE IGGFGTAAGKK 
VAVAD VQ PGPMRFHQDQLQ VLTiVFTKEDNQ CNGFCRACE KAGFK 
CTVTKE AQAVIiAC FLDKHHD 1 1 I IDHRNPRQLDA2ALCRS IRSS 
KJ -S ENTVI VGWRRVDREELSVMPFISAGFTRRYVENPNIMACY 
NELLQLEFGEVRSQLKLRACNSVFTALENSEDAI3ITSEDRFIQ 
YANPAFETTMG YQSGELIGKELGEVP INEKKADLLDTINSC IRI 
GKEWQG T YYAKKKNGDNIQQNVKI I PVIGQGGKIRHYVS I IRVC 
NGNNKAEKI SE CVQSDTHTDNQTGKH KDRRKGS LDVKAVASRAT 
E VSSQRRHS SMARIHSMT I EAP I TKV I N 1 1 NAAQESSPMPVTEA 
ZiDRVLE I LRTTELYSPQFGAKDDDPHANDLVGGLMSDGIjRRLSG 
NEYVLSTKNTQMVSSNI ITP 1SLDDVPPRI ARAMENEEYWDFD I 
FELE AATHNRPLI YLGLKMFARFGICEFUICSE S TIiRS WLQI I E 

IAATIHDVDHPGRTNSFLCNAGSELAILYNDTAVLESHHAALAF 
QLTTGDDKCNIFKNMERNDYRTLRQGI IDMVTjATEMTKHFEHVN 
KFVNSXNKPLATLEENGETDKNQEVINTMLRTPENRTIilKRMLI 
KCADVSNPCRPLQYCIEWAARISEEYFSQTDEEKQQGLPVVMPV 

KYWKGLDEMKLRNLRPPPE 


5508 


1151 




LSSVFSRRSASMFAVGCSMGPFLHYWYLSLDRLFPASGLRGFPN 
VLKKVLVDQLVASPLLGVWYFJ^IjGCLEGQIVGESCQELREKFW 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYINGLTLGWDrYLSYL 
KYRSP VP LTPPG C VALDTRAD 


5509 


1238 


619 


rksrgcqnai^asgpaaaaaaimvrklkfheqkllkqvdflnWe
vt dhnlh e lr vl rr yrlqrr ed ytr ynqls ravrelarrlr d l p 
erdqfrvrasaalldklyalglvptrgslelcojfvtassfcrrr 

LPT VLIiKLRMAQHLQAAVAP VEQGHVRVG PDVVTDPAFTjVTR SM 
EDPVTWVDSSKIKRHVLEYNEERDDFDIjEA 


5510 


96 


1195 


PAGAHIiSSGSSEPLVEPGRGRVGARVKGERGLQASGSAPGRSJ^ 
AEGERQPPPDSSEEAPPATQNFIIPKKEIHTVPDMGKWKRSQAY 

adyigf i krlnegvkgktcltfeyrvseaiekiivallnrldrw id 
etppvdqpsrfgnkayrtwyakldeeaenlvatvvpthlaaavp 

E VAVYLKES VGNS TR I DYGTGHEAAFAAFLCCLCKIG VLRVDDQ 
lArvFKVFNRYIiEVMRKTjQKTYRMEPAGSQGVWGLDDFQFLPFI 
WGSSQLIDHPYLEPRHFVDEKAVNENHKDYMFLEC1LFITEMKT 
GPFAEHSNQtiWNISAVPSWSKVNQGLIRMYKAECLEKFPVIQHF 
KFGSIA*PIHPVTSG 


5511 


276 


1980 


KLSRVLNLPPENLITSISAVPISQKEEVADFQLSVDSLLEKDND 
HSRPDIQVQAKRIAEKLRCDTWSEISTGQRTVNFKINRELLTK 
TVLQQVI EDGS KYGLKS EL F SGL PQKKI WE PS S PNVAKKFHVG 
HLRST I IGNFIANLKEALGHQVIRINYLGDWGMQFGLLGTGFQL 
FGYBEKLQSNPLQHLFEVYVQVNKBAADDKSV7UCAAQEPFQRLE 
LGDVQALS LWQKFRDLS I EEY IRVYKRLGVYFDEySGESFYREK 
SQEVLKLhESKGLhLKTXKGTAWDLSGmDPSBJCTVMRSDGT 
SLYATRDLAAAlDRMDKYKFDTMiyVTDKGQKKHFQQVFQMLKl 
MGYDWAERCQHVPFGWO/3MKTRRGDVTFLEDVLNEIQIJ5MLQN 
MAS I KTTKE LKNPQETAER VGLAAL I IQDFKGLLL3 D YKFS WDR 
VFQSRG DTG VFLQYTHARLHS LEETFGCG YLNDFNTAC LQE PQS 
VS ILQHLLR FDEVItYKSSQDFQPRHI VS YXjLTLSHLAAVAHKTL 
QIKDSPPEVAGARLHLFKAVRSVLANGMKLLGITPVCRM 


5512 


120 


1015 


DPSIiliLTI TVTGVTVLVL VLKSMWSRRREP ITLQDPEAKYPLPL 
IE KEKISHNTRRFRFGIiPSPDHVLGLPVGNYVQLLAKI DNELW 
RAYTPVSSDDDRGFVDLIIKIYFKNVHPQYPEGGKMTQYLENMK 

GGTG ITPMLQLI RK I TKD PSDRTRMSLI FANQTEEDI LVRKELE 
E IARTHPDQFDLWYTfcDR PP IGWK YSSGF VTADMIKEHLPPPAK 
STLILVCGPPPLIQTAAHPNLEKLGYTQDMI FTY 


5513 


2 


837 


AR WRL PS DS PR I PPAG AE TPG RGS CRNYJjPS SS PPP P E PS S FP S 
PPTSRGGPGSRDTMSDSEEESQDRQLKIWLGDGASGKTSIjTTC 
FAQETFGKQY KQTIGLDFFtiRRI TLPGNLNVTLQ I WD IGGQTIG 
GKMLDKYI YGAQGVLLVYDITKYQS FENLBDWYTWKKVS EESE 
TQPLVALVGNKIDLEHMRTIKPEKHLRFCQENGFSSHFVSAKTG 
DSVFLCFQKVAAEILGIiCLNKABIEQSQRVVKADrVNYWQEPMS 
RTVNPPRSSMCAVQ 


5514 


1235 


449 


VWRPSWIMGNFRGHALPGTFFFI IGLWWCTKS ILKYICKKQKRT 
CYLGSKTLF YRLE I LEG I T I VGMALTGMAGE QF I PGGPHLMIjYD 
YKCGHWNQLLGl-mHF'mYFFPGLLGVADILCFTISSLPVSLTKl, 
MLSNAL FVEAF I FYNHTHGREMLD I FVHQLLVL WFLTGIiVAFL 
EFXVRNNVLLELIiRSSLILLQGSWFFQIGFVLYPPSGGPAWDIiM 
DHENILFLTICFCWHYAVTIVIVGMNYAFIXWLVKSRLKRLCSS 
EVGLLKNAEREQESEEEM 


5515 


1572 


260 


FVRLVGRGDCDPLLSVC1.TTMPLYEGLGSGGEKTAWIDLGEAF 
TKCGFAGETGPRCIIPSVIKRAGMPKPVRWQYNINTEELYSYL 
KEF I H I LYFRHLLVNPRDRRWT IBS VLCPSHFRE TLTRVLFKY 
FEVPSVLLAPSHLMALliTLGINSAMVLDCGYRESLVLPIYEGXP 
VLNCWGALPLGGKALHKELETQIiLEQCrrVDTSVAKEQSLPSVMG 
5 VPEG VLED IKARTCF VSDLKRGLKIOAAfCFNIDGjMNERPS PP P 
KVD Y PLDGEKI LH I LGS IRDS WE I LFEQDNEEQSVATLILDSL 
IQCP IDTRKQ LAEN LWI GGTS MLPGFLHRLI*AE I R YLVE KP KY 
KKALGTKTFR 1HTP PAKANCVAWLGGAI FGALQDI LGSRS VS KE 
YYNQTGRIPDWCSLNNPPLEMMFDVGKTQPPTJMKRAFSTEK 


5516 


3 


735 


NSREPPOAGPGPSPRKSPTASSFZiFpWKPLASSFWMGAQGAOES 
1 KAMWRVPGTTRRPVTGES PGMHRPEAMIiLLLTIALLGGFTWAG 
KMYG PGGGKYFS TTED YDHE I TGLRVS VGLLL VKS VQ VKLGDS W 
D VKLGALGGNTQE VTLQFGE Y I TKVF VAFQAFLRGMVMYTS KDR 
Y F YFGKLDGQIS SAY P SQEGQVI*VG I YGQYQLLG I KS I GFE WNY 
PLEEPTTEPPVNLTYSANSPVGR * 


5517 


246 


499 


SE I YVAMRTDSS KMTDVESG VANFASSARAGRRNALPD IQSSAA 
TDGTS DL PLKLEALS VKEBAKEKDEKTTQDQL EKPQNE EK 


5518 


3 


1375 


DAWADAW VRAWDIJiMDFPCLWLGLLLPLVAALDFNYHRQEGMEA 
FL KTVAQNYSS VTHLHS I GKS VXGRNLWVLWGR FPKEHR1G I P 
EFKYVANMHGDETVGRSLLLHIilDYLVTSDGKDPElTNLINSTR 
IHIMPSMNPDGFEAVKKPDCYYS I GRENYNQYDLNRNPPDAFE Y 
l^SRQPElVAVMKWLKIOTFVIiSAMriKGGALVASYPFDWGVQA 
TGAL YSR S LTPD DDVFQY LAHT YAS RNPNMKKGDE CKNKT*ETFPN 
GVTNGYSWYPLQGGMQDYNYIWAQCFEITLELSCCKYPREEKLP 
SFl*NNNKASLIHTYIKQVHLGVKGQVFDQNGMPLPNVIVEVQDRK 
H I CPYRINKYGE YYLLLLPGS Y I INVTVPGHDPHITKVI I PEKS 
QN FSALKKD I LLP FQGQLDS I PVSN PS CPM I PL YRNLP DHS AAT 
KPSLFL FLVS LLHIFFK 


5519 


87 


477 


I KS KLKQQVE VQE SEWRLTEAKGPTMGKESG WDSGRAAVAAWG 
GWAVGTVLVALS AMGFTS VGIAASS I AAKMMS TAAI ANGGGVA 
AGSLVAI LQS VGAAGLS VTSKVI GGFAGTALGAWLGS P PSS 


5520 


117 


943 


PTEGRQKVLKTFTVPRSALAMTKTSTCIYHFLVLSWYTFLNYYI 
SQEGKDE VKP KILANGARWKYMTLLNLLLQT I F YG VTCLDD VLK 
RTKGGKDIKFLTAFRDLLFTTLAFPVSTFVFLAFMILFLYNRDL 
I YPKVLDTVI PVWLNHAMHTF IFPI TLAEVVLRPHS YPSKKTGL 
TLLAAAS IAYI SRILWLYFETGTWVYPVFAXLSLLGLAAFFSLS 
YVF I ASI YLLG EKLNHW KW VSVQ I LQRWRLESVGI CFQWPDWKS 
PAKHQLVKNIR 


5521 


546 


911 


KILNMQKSCEENEGKPQNMPKAEEDRPLEDVPQEAEGNPQPSEE 
GVSQEAEGNPRGGPNQPGQGFKEDTPVRHLDPEEMIRGVDELER 
LREEIRRVRNKFVMMHWKQRHSRSRPYPVCFRP 


5522 


1224 


637 


GSRPliGQRSREK^VFGYGSLIWKVDFPYQDKLVGYITNYSR^F 
WQGSTDHRGVPGKPGRVVTLVEDPAGCVWGVAYRLPVGKEEEVK 
A YLDFREKGG YRTTTVI F YPKDP TTKPFS VLL YI GTCDNPD YLG 
PAPLEDIAEQ I FNAAGPSGRNTEYLFELANS IRNLVPEKADEHL 
FALEKLVKERLEGKQNLNCI 


5523 


3 


1280 


S XGKKRMGS S MS AATARRPVFDDKED VNFDH FQ I LRA1GKGSFG 
KVC I VQKRDTEKM YAM ECYMNKQQC IERDEVRNV FRELE I LQEI E 
HVFLVNLWYSFQDEEDMFMWDLLLGGDLRYHLQQNVQFSEDrV 
RLYI CEMALALDYLRGQHI I HRDVKPDNILLDERGHAHLTDFNI 
ATI I JCDGERATALSGTKPYMAPE I FHS FVNGGTG YSFEVEWWSV 
GVMAYELLRGWRPYDIHSSNAVESLVQLFSTVSVQYVPTWSKEiyi 
VALLRKLLTVNPEHRLSSLQDVQAAPAtiAGVLWDHLSEiCRVEPG 
FVPNKGRLHCDPTFELEEMILESRPLHKKKKRIAKNKSRDNSRD 
S SQSENDYLQDCLDAIQQDF VI FNREKLKRSQDLPREPLPAPES 
RDAAE PVEDEAERSALPMCG PICPSAGSG 


5524 


85 


2318 


RERERDHR PGESS OGQS GAGGCF PS P TMELRCGGLLFSSRFD S G 
NLAHVEKVESLSSDGEGVGGGASALTSGIASSPDYEFNVWTRPD 
CAETE FENGNRSWFYFSVRGGMPGKLI KINIMNMNKQSKLYSQG 
MAPFVRTLPTRPRWERIRDRPTFEMTETQFVLSFVHRFVEGRGA 
TTFFAFCYPFSYSDCQELLNQLDQRFPENHPTHSSPLDTIYYHR 
ELLC YSLDGIiRVDLLTI TS CHGLREDREPRLEQLFPDTSTPR PF 
RFAGKR I FFLSS RVHPGETP S S FVFNG FLDFI LR PDDPRAQTLR 
RLF VFKLI P MLN PDGVVRGH YRTDSRGVNLNRQ YLKPDAVLHPA 
I YGAKAVLLYHHVHSRLNSQ S SSEHQ P SSCLPPDA? VSDLE KAN 
NIiQNEAQ CGHS ADRHNAEAWKQTEPAEQKIjNS VW IMPQQS AGLE 
ESAPDTI PPKE SGVAY YVDLHGHAS KRG CFMYGNS FSDESTQVE 
NMLYPKLISLNSAHFDFQGCNFSEKNMYARDRRDGQSKEGSGRV 
AI YKASGIIHS YTLECNYNTGRSVNS I PAACHDNGRASPPPPPA 
FPSR YTVELFEQVGRAMAI AALDMAE CNPWPR I VLS EHS SLTNL 
RAWML KHVRNSRGLSSTLNVG VNKKRGLRTPPKSHNGL PVSCS E 
NTLSRARSFSTGTS AGGS S S SQQNS P QMKNSP S FP FHGSRP AGL 
PGhGSS TQKVTHR VLG PVRGKP VWEP LQHVFGCLGHC WG K 


5525 


105 


834 


SNTLDFERHLFIMGQQISDQTQLVINKLPEKVAKHVTLVRESGS 
LTYEE FI^RVAELNDVTAKVASGQEKHLL FEVQ PGSDS S AFWKV 
WRWCTKINKS SG I VEAS R I MNLYQ F IQL YKD 3 TSQAAGVLAQ 
S STS EE P DENSSS VTSCQAS LWMGRVKQLTDEEECC I CMDGRAD 
L I LP CAHS FCQKC I DKWSDRHRNCPI CRLQMTGANES WWSDAP 
TEDDMAN Y I LNMADEAGQPHRP 


5526 


3 


853 


RRPCWP VRAAKRTGAAARA PRGLE VTMLR VAWRTLS LI RTRAVT 
QVLVPGLPGGGSAKFPFNQWGLQPRSLLU3AARGYVVRKPAQSR 
XX)DDPPPSTLLKBYQNVPGIEK\7DDWKRLLSIiEMANKKEMLKI 
KQEQFMKKIVANPEDTRSLEARIIALSVKIRSYEBHLEKHRKDK 
AHKRYLLMSIDQRKKMLKNLRNTKYDVFEKIC WGLG IEYTFP PL 
YYRRAHRRFVTKKALCIRVFQETQKLKKRRRALKAAAAAQKQAK 
RRNPDS PAKAI PKTLKDSQ 


5527 


3225 


565 


LLRKYLLHQWPLLLRHQPNRTCI SFSATMKLKDTKS RP KQSS CG 
KFQTKGI KWG KWKE VK I DPNMFADGQMDDLVCFEELTD YQLVS 
PAKNPSSLFSKEAPKRKAQAVSEEEEEEEGKSSSPKKKIKLKKS 
KNVATEGTSTQKEFEVKDPELEAQGDDMVCDDPEAGEMTSENLV 
QTAPKKKKNKGFOCGLEPSQSTAAKVPKKAKTW I PEVHDQKADVS 
AWKDLFVPRPVLRALSFLGFSAPTP IQALTLAPAIRDKLDILGA 
AETGSGKTLAFAI PM IHA VLQWQ KRNAAPP PSNTEAP PGETRTE 
AGAETRS PGKAEAESDALPDDTV I ESEALPSD IAAEARAKTGGT 
VSDQALL FGDDDAGEGPSSLIRE KPVPKQNEWEEENLDKEQTGN 
LKQE LDD KS ATCKAY PKRPLLGLVLTPTRB LAVQVKQH IDAVAR 

YHLRNIiRQLRCLWDEADRMVEKGHFAELSQLLEMLNDSQYtQPK 
RQTLVFSATLTLVHQAPARILHKKHTKKMDKTAKLDtiLMQKIGM 
RGKP KVI DLTRNEATVETLTETK IH CETDEKDF YL Y Y FLMQ YPG 
RSIiVFANSISCIKRbSGLLKVLDIMPLTLHACMHQKQRLRNLEQ 
FARLEDC VLLATDVAARGLDI PKVQHVIHYQVPRTS E I YVHRSG 
RTARATMEGLSLMLIGPEDVINFKKIYKTLKKDEDIPLFPVQTK 

EDMYKGGKADQQEERRRQKQMKVLKKELRHLLSQPLFTESQKTK 
YPTQSGKPPLLVSAPSKSESALSCXiSKQKKKKTKKPKBPQPEQP 
QPSTSAN 


5528 


3 


895 


GPFLSACRMWGACKVKVHDSLATIS ITLRRYLRLGATMAKSKFE 

QLMTKCAQTVMEELEDIVIAYGQSrEYSFVFKRKTNWFKRRASK 
FMTHVASQFASSYVFYWRDYFEDQPLLYPPGFDGRVWYPSNQT 
LKDYLS WRQADCHINNL YNTVFWAL I QQSGLTP VQ AQGRLQGTL 
AADKNEILFSEFK1NYNIIEPPMYRKGTVLIWQKVDEVMTKEIKI* 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL 


5529 


46 


640 


TFRLVSAHLKTRXIjINPEAAERRWRDWDSRQGWLSVKMQRVSGL 
LSW TLS R VLWLSGLSEPGAARQPR IMS EXALE VYDLI RT I RDPE 
KPNTIiEELEWSE SCVE VQE INEE E YL VI I R FTP TVPHCS LATL 
IGLCLRVKliQRCLPFKHKLE I YISEGTHSTEED IKKQ INDKERV 
AAAMENPNLREIVEQCVLEPD 


5530 


4541 


2606 


AQIVHAISYCHKliHVGHRDLiKPElWVFFEKQGljVKIjTDFGFSNK 
FQPGKKIjTTS CGSLAYSAPE I LLGDEYDAPAVD I WSLGVI LFML 
VCGQPPFQEA3^DSETTjTMIMDCKYTVPSHVSKECKDLITRMI»C3R 
DPKRRAStiEEIENHPWLQGVDPSPATKYWIPLVSYKNLSEEEHN 
S 1 1 QRMVLGD I ADRDAI VEALETNRYKHITAT YFLLAERI LR.EK 
QEKEXQTRSAS PSNIKAQFRQS WPTKIDVPQDLEDDLTATPLS H 
ATVPQS PARAADSVLNGHRSKGLCDSAKKDDLPELAGPAIjSTVP 

paslkptasgrkclfrveedeeedeedkkpmslstqwlrrkps 
vtnrhtsrksapvwq i feegesddefdmdenlppklsrlkm3st1 
aspgtvhkryhrrksqgrgsscsssetsdddsesrrrldkdsgf 
tyswhrrdsseg p pgsegdgggqs kpsnasggvdkas psennag 
ggspssgsggnptntsgttrrcagpswsmqlasrsagelveslk 
lmslclgsqlhgstkyiidpqnglsfssvkvqekstwkmcisst 
gnagqvpavgg i kf fsdhmadttte leri ks kwlknnvlq iiplc 
ekti s vnxqrnpkegllcas spascchvt 


5531 


24 


515 


GSQPRAPRPRDSMERPEPELIRQSWRAVSRSPLEHGrVIiFARLF 
ALEPDLLPtiFQYNCRQFSSPEDCr.SSPEFLDHIRKVMLVIDAAV 
TtWEDLS SLEE YLAS LGRKHRAVGVKLS S FSTVGES LLYMLEKC 
LGPAFTPATRAAWSQLYGAWQAMSRGWDGE 


5532 


3395 


1402 


SDWMVVGKRKM 1 1 EDETEFCGEELLHSVLQCKS VFDVLDGEEMR 
RARTRANPYEKIRGVFFLNRAAMKI^LAl^DFVFDRMFTNPRDSYG 
KPIiVKDREAELL YFAD VCAGPGGFS E YVIiWRKKWHAKG FGMTIiK 
GPNDFKLEDFYSASSELFEPYYGEGGIDGDGDITRPENISAFRN 
FVI£INTDRKGVHFLMAIX3GFS\^:GQENLQEILSKQLLLCQFIjMA 
LS IVRTGGHFICKTFDLFTP FS VGLVYLL YCCFERVCLFKP ITS 
R PANSERYWCKGLKVGIDD VRD YL FAVN I KLNQLRNTDSDVNL 
WPLEVIKGDHEFTDYMIRSWESHCSLQIKALAKlHAFVQDTTri 
SEPRQAEIRKECLRIWGIPDQARVAPSSSDPKSKFFELIQGTEI 
DI PSYKPTLLTSKTLEKI RPVFDYRCMVSGSEQKFLIGLGKSQI 
YTWDGRQSDRWIKLDLKTELPRDTLLSVEIVHELKGEGKAQRKI 
qb. T w tt .twt t vr .rafltTmro pnH pram? t nt .ivtr ypw aif q v d qd Dnwiw 

PIRVKEVYRfcEEMEKI FVKLEMKI I KGSSGTPKLSYTGRDDRHF 
VPMGL Y I VRTVNE PWTMGFS KS FKKKFFYNKKTKDST FDL PADS 
IAP FHICYYGRLFWEWGDG I RVHDSQKPQDQDKLS KEDVLSFIQ 
MHRA 


5533 


94 


789 


TVFENYTACLETEEQRVELSLWDTSGS P YYDNVR PLCYSDSDAV 
LLCFD I SRPET VDSALKKWRTE I LD YC P STRVLL I GCKTDLRTD 
LSTLMELSHQKQAPISYEQGCAIAKQLGPEIYLEGSAFTSEKSI 
HSIFRTASMLCLNKPSPLPQKSPVRSLSKRLLHLPSRSBLISPT 
FKKEKAKXCS IM 


5534 




owo 


TGVEKRYLAAGAVTLLSLYI^LPGYGASLLCNIi IGFVYPAYAS IK 
AIESPSKDDDTVWLTYWWYAliFGIiAEFFSDLLLSWFPFYYVGK 
CAFLLF CMAPRPWNGALMTj YQRWR PLFLRHHGAVDRI MNDLSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535 


1029 


332 


NLQKLKNSERILTEAKQKMREbTVNI KMKEDL I KELI KTGNDAK 
S VS KQ YTLKVTKLEHD AEQ AKVELTE TQKQLQE LEN KDLS D VAM 
KVKLQKEFRKKVDAAKLRVQVLQKKQQDSKKIiASIiSIQNEKRAN 
ELEQSVDHMKYQKIQLQRKLQEENEKRKQLDAVIKRDQQKIKVl 
LSYIPAKYNMKC 


5536 


942 


282 


AAATAASLSPRGCRLRTPSSDVSPSRA?PPSAAPLPrSRAQMSP 
SGRLCLLTIVGLILPTRGQTLKDTTSSSSADATIMDIQVPTRAP 
DAVYTELQPTSPTPTWPADETPQPQTQTQQLEGTDGPLVTDPET 
HKSTKAAHPTDDTTTLSE RP S PSTDVQTDPQTIiKP SGFHEDD PF 
FYDEHTLR KRGI> LVAAVLFITGI 1 1 LT-SGKCRQLSRLCRNHCR 


5537 


3 


2391 


RARVSSPQLRVFRSGRPRRLRVLRINRTSVALRTjAGTGRFVAKT 
PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAQKNLYQDVMLEN 
YRNLVSLGIiWSKPDLITFLEQRKEPWNVKSEETVAIQPDVFSH 
YNKDLLTEHCTBASFQKVISRRHGSCDIiENLHLRKRWKREECEG 
HNGCYDEKTFKYDQFDESSVESLFHQQXLSSCAKSYNFDQYRKV 
FTHSSLLNQQEE ID I WGKHHI YDKTSVLFRQVSTIjNS YRKVFIG 
EKNYHCNNS EKTLNQSSSPKNHQENYFLEKQYKCKEF2EVFLQS 
MHGQEKQEQSYKCWKCVE VCTQSIjKH I QHQTIH IREN S YS YNKY 

TEEKPYKWKE<^KVFNLNCSIiYI»TKQQQII>TGENI»YKCKACSKS 
FTRSSNLI VHQR IHTGEKPYKCKECGKAFRCSS YLTKHKR IHTG 
EKPYKCKECGKAFNRS SCLTQHQTTHTGEKLYKCKVCS KSYARS 
SNLIMHQRVHTGEKP YKCKECGKVFSRS S CLTQHRK IHTGENLY 
KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKECGKAFPYSSHLI 
RHHRIHTG EKP YKCKACS KS FSDSSGLTVHRRTHTGEKP YTCKE 
CGKAFSYSSDVIQHRRIHTGQRPYKCEECGKAFNYRSYLTTHQR 
SHTGERPYKCEECGKAFNSRSYLTTHRRRHTGERPYKCDECGKA 
FSYRSYLTTHRRSHSGERPYKCEECfGKAFNSRSYLlAHQRSHTR 
EKL 


5538 


92€ 


161 


HSMMMKIPWGSIPVLMLLIjLLGLIDISQAQIjSCTGPPAIPGIPG 
IPGTPGPDGQPGTPGIKGEKGLPGLAGDHGEFGEKGDPGIPGWP 
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKIAFSATRTI 
NVPLRRDQTIRFDHVI TNMNNNYEPRSGKFTCKVPGLYYFTYHA 
SSRGNLCVWLMRGRERAQKVVTFCDYAYNTFQVTTGGMVLKLEQ 
GENVFLQATDKMSLLGMEGANSIFSGFLLFPDMEA 


5539 


38 


1258 


HRGPSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPG 
IVDGPAAIiASFPETVPAVPGPYGPHRPPQPIiPPGLDSDGLKREK 
DE I YGHPLF PLIjAIiVFE KCEIiATCS PRDGAGAGLG TP PGGDVCS 
SD S FNED I AAFAKQ VRS ERP L FS SNPELDNLVIQAI QVI*RFHLiL 
ELKKVHDLCDNFCHRY I TCLKGKMPIDLVI EDRDGGCREDFEDY 
PAS CPSLPDQNNMWIRDHEDSGS VHLGTPGPS SGGLAS QSGDNS 
SDQGDGLDTS VAS PSSGGEDEDLDQERRRNKKRG I FPKVATNIM 
RAWLFQHIjSHPYPSEEQKKQLAQDTGLTILQVKNWFINARRRIV 
QPMI DQSNRTGQGAAFS PEGQP IGG YTETQ PHVAVRPPGS VGMS 
LNLEGEWHYL 


5540 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRyDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLFPGLDSDGLKRBKDEI 
YGH PLFP LLALVF EKCELATCS PRDGAGAGLGTPPGGDVCS SDS 
PNEDNTAFAKQVRS2RPLPSSNPELDKTLMIQAIQVLRFHLLELE 
KGKMPlVLVIEDRDGGCRBDFEDYPASCPSLPDQmi^lRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSYASPSSGGED 
EDLDQE PRRNKKRG I FPKVATNIMRAWLFQHLSHP YP SEEQKKQ 
LAQDTGLT I LQVNNWF INARRR I VQ P M IDQSNRTGQGAAFS PEG 
Q PIGG YTETEPfiVAFRAPAS VGDE FGTRKE EWH YL 


5541 


143 


1440 


P PLG AGAG VHARS PHPARRLPLTTAGVGGRAPDLLPT PWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
G PAALAS FP ETVPAVPGPYGPHRP PQPLP PGLDS DGLKRE KD E I 
YGHPLFPLLALVFE kcelatc S PRDGAGAGLGTP PGGDVCS SDS 
FKEDNTAFAKQWSERPLFSSNPELDNLMIQAIQVLRFHLLELB 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGS VHLGTPG PSSGGLASQSGDNS SDQG VG LDTSVAS PSSGGED 
EDLDQEPRRNKKRG X FPKVATN IMRAWLFQHLSHPYPSEEQKKQ 
IAQOTGriTXJjQVNNWFiriAHRRIVQPMIDQSNRTGQGAAFSPEG 
QP I GG YTET3 P HVAFRAPAS VGDE FGTRKEE WHYL 


5542 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALA5FFETVPAVPGP YGPHRPPQPLP PGLDSDGLKREKDB I 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHBD 
SGS VHLGTPGPSSGGLAS QSGDNSSDQGVGLDTSVAS PSSGGED 
EDLDQEPRRKKKRG I FPKVATNIMRAWLFQHLSHP YPSEEQKKQ 
LAQDTGLT ILQVNNW F INARRRI VQPMIDQSNRTGQGAAF3 PEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5543 


2405 


665 


RWVREQPWPLRTSEAVKTPALRPFPGPRGVSPFPKPDWGKSPAP 
KRPFSDSGAFWS PERRPG VLEAPRRRPVPAS FRAVP PKPTRVHG 

SSASRDRVLARTMIVADSECRAELKDYLRFAPGGVGDSGPGEEQ 
KESRARIW5PPJ3P3A_FTPVf»'frt/T.P Pfiaircr irnuT r*r pat mqcodt; 

DNIAVVMGLHPDYFTSFWRLHYLLLHTDGPLAS SWRHYIAIMAA 
ARHQCSYLVGSHMAEFLQTGGDPEWLLGLHRAPEFCLRKLSEINK 
LLAHR P WL I T KEH I QAL L KTGEKTWS LAE L I Q AL VLLTH CH S hS 
SFVFGCGILPEGDADGS PAPQAPTPPSEQS S PPSRDFLNNSGGF 
ES ARD VE ALMERMQQLQESLLRDEGTSQEEMESR FELE KSESLL 
VTPS AD I LEPS PHPDMLCFVEDPTFGYEDFTRRGAQAPPTFRAQ 
DYTWEDHGYSLIQRLYPEGGQLLDEKFQAAYSLTYNTIAMHSGV 
DTS VLRRAIWN Y I HCVFG I R YDD YD YGE VNQLLERNLKVY I KT V 
AC YPEKTTRRM YNL FWRHFRHS EKVHVWLLLLEARMQAAL L YAL 
RAITRYMT 


5544 


1895 


514 


I^GLLGRQRLLLRMGAGRLGAPMERHGRASATSVSSAGEQAAGD 
PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR 
QRVESLRKKRPLFPWFGLDIGGTLVKLVYPEPKDITAEEEEEEV 
ESLKS I RKYLTSNVAYGS TGI RDVHLE L KDLTLCGRKGNLHF IR 
FPTHDMPAFIQMGRDKNFSSLHTVFCATGGGAYKFEQDFLTIGD 
LQLCKLDELDCLI KGILYI DSVGFNGRS QCY YFEWPADSEKCQK 
LPFDLKNPYPLLLVNIGSGVSILAVYSKDNYKRVTGTSLGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTPCVDKLVRDIYGGDYERFG 
LPGWAVAS S FGNMMS KEKREAVS KEDLARATLITI TNNIGSIAR 
MCALWENINQVVF VGIIFLRINTIAMRLLA YALD YWS KGQLKALF 
SEHEGYFGAVGALLELLKIP 


5545 


802 


131 


GAMWSAGRGGAAWPVLLGLLLALLVPGGGAAKTGAELVTCGSVL 
KLLNTHHRVRLH SI IDI KYGSGSGQQS VTGVEASDDANS YWRIRG 
GSEGGCPRGS P VRCGQAVRLTHVLTG KNLHTHHF PS PLSNNQEV 
SAFGEDGEGDDLDLWTVRCSGQHWEREAAVRFQHVGTSV?I>SVT 
GEQYGSPIRGQHEVHGMPSANTHNTWKAMEGIFIKPSVEPSAGH 
DEL 


5546 


1592 


146 


r VP KGG.-iSS MGQ SGRs RHQKRAKAQAQLRNLEAYAANPHS F V FT 
RGCTGRNIRQIiS LD VRRVMEPLTAS RLQVRK KETS LKDCVAVAG P 
LGVTHFLILSKTETNVYFKLMRLPGGPTIjTFQVKKYSLVRDWS 
SIiRRHRMHEQQFAHPPLLVLNSFGPHGMHVKI^IATMFQNLFPSI 
NVHKVNLNT IKRCtiLIDYNPDSQELDFRHYS I KWPVGASRGMK 
KLLQEKFPNMSRLQDISEUATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAQQS AVRLTE I GPRMTLQ LI KVQEGVGEGKVM FHS 
FVS KTEEELQAI LEAKEKKLRLKAQRQAQQAQNVQRKQEQREAH 
RKXSLEGMKKARVGGSD£EA5GIPSRTASLEIjGEDDDEQE.DDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL 
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 


5547 


1592 


146 


FVPRGGHSSMGQSGKSRHQKRARAQAQtiRNTjE 
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKWSLKDCVAVAGP 

lgvthflilsktetnvyfklmrlpggptltfqvkkyslvrdws 
slrrhrmheqq fahppllvlnsfgphgmhvklmatmfqnlfps i 
nvhicvnlntikrcllidynpdsqeldfrhysikwpvgasrgmk 
kllqekfpnmsrlqdisellatgaglseseaepdgdhnitelpq 
a vagrgnmraqqsa vrlte igprmtiqli kvqeg vgegkvmfhs 

FVSKTEEELQAIIjEAKEKKLRIiKAQRQAQQAQNVQRKQEQREAH 

rkkslegmkkarvggsdeeasgi psrtaf5 lelgedddeqeddd i 
eyfcqavgeapsedlfpeakqkrlakspgrkrkrmemdrgrgrl 
cdqkf p k.tkd ks qg aqarrg prg as r dggrgrg rgr pgkr va 


5548 


1 


2153 


DQTGPPETIAFTFPRSTMEPLCPLLLVGFSIiPLARALRGNETTA 
DSNETTTTSGPPDPGAS QPLLAWLLL PIjIJjXiIjLVIjIjIjAAYFFRF 

rkqrkawstsdkkmpng ileeqeqqrvmllsrs psgpkkyfpi 
pvehleeeirirsaddckqfreefnslpsghiqgtfelankeen 
reiotrypwilpwdhsrvilsoldgipcsdyinasyidgyxeknk 
fiaaqgpkqe tvnd fwrmvweqksati vmltnl kerke ekchqy 

WPDQG CWTYGNTR VCVEDCWLVDYTIR KFCI QPQLPDGCKAPR 

lvsqlhftswpdfgvpftpigmlkflkkvktlnpvhagpiwhc 
sagvgrtgtfi vidammammiiaeqkvdvfefvsr irnqrpqmvq 

TDMQYTF I YQALLE YYLYGDTE IjDVS SLEKHLQTMHGTTTHFDK 
IGLEEEFRKLTNVR I MKBNMRTGNLPANMKKARV I Ql I PYDFNR 

VTT.CMVBflrnfVTnVTMIi CJ?TTV2VOftvnvUT R*TV\r20T f\ l_I r P\rX.' r\ till 

RM I WEWKSHTI VMLTE VQB REQDKCYQYWPTEGS VTHGE ITI El 
KWDTLSEAIS IRDFL VTLNQPQARQE EQVRWRQFHFHG WPE IG 
IPAEGKGMIDLIAAVQKQQQQTGNHPITVHCSAGAGRTGTFIAL 
SNILERVKAEGLIiDVFQAVKSLRLQRPHMVQTLEQYEFCYKVVQ 
DFIDI FSDYANFK 


5549 


915 


256 


FEATGGKRLAFKMAGTARHDREMAIQAKKKLTTATDPIERLRIiQ 
CLARGSAGIKGLGRVFRIMDDDNNRTLDFKEFMKGLNDYAWME 
KEEVEELFQRFDKDGNGTIDFNEFLLTLRPPMSRARKEVIMQAF 
RKLEKTGDGVIT I EDLRE VYHAKHHP KYONGE Wfi E ED VFRKFT^fi 
NFDSPYDKDGLVT PEE FMN YYAGVSAS IDTDVYFI IMMRTAWKL 


5550 


2364 


1210 


RKRKVFLKMRRLNRKKTLSLVKEIiDAFPKVPESYVETSASGGTV 
SLIAFTTMALLTI MEFSVYQDTWMKYE YEVDKDFSSKLRIN I DI 
TVAMKCQYVGADVLDIjAET^ASADGLVYEPTVFDT^SPQQKEWQ 
RMLQLIQSR1QEEHSLQDV1FKSAFKSTSTALPPREDDSSQSPN 
ACR IHGHL YVNKVAGNFHI TVGKAI PH PRGHAHIAALVNHES YN 
FSHRIDHLSFGELVPAIINPIiDGTEKIAIDHNQMFQYFITWPT 
KLHTYKISADTHQFSVTERER1INHAAGSHGVSGIFMKYDLSSL 
MVTVTE EHMPFWQ F FVRLCG I VGGI FSTTGMLHGIGKFIVE I IC 
CRFRLGSYKPVNSVPFEDGHTDNHLPLLENNTH 


5551 


211 


1700 


MQRDHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVL.VSVGRSE 
WFVFRRYAEFDKLYNTLKKQFPAMALKIPAKRiPGDNFDPDFIK 
QRRAGltNEFIQNLVRYPELYNHPDVRAFLQMDSPKHQSDPSEDE 
DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 
LLAKRKLDGKF YAVKVLQ KKI VLNRKEQKHIMAERNVIjLKNVKH 
P FLVGLH YS FQTTE Kli YFVLD F VNGGELFFHLQRE RS FPEHRAR 
FYAAEIASALGYLHSIKIVYRDLKPENILLDSVGHWLTDFGLC 
KEGIAI SDTTTT FCGTPEYLAPEVI RKQP YDNTVDWWCLGAVLY 
EMLYGLPPFYCRDVAEMYDNI LHKPLSLR PGVSL TAWS ILEELL 
EKDRQNRLGAKEDFLE IQNHPFFESLSWADLVQKKI PPPFNPKV 
AGPDD I RWFDTAFTEEIVP YS VCVSSDYS IVNAS VLEADDAFVG 
FSYAPPSEDLFL 


5552 


2748 


930 


LGPAAGAAMGKKHKKHKAEWRSSYEDYADKPI,EKPLKLVLKVGG 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 
PVRACRTQPAENESTP I QQLLEHFLRQLQRKDPHGFFAFPVTDA 
IAPGYSMI I KHPMDFGTMKDKI VANEYKSVTEFKADFKLMCDWA 
MTYNRPDTVYYKLAKKILHAGFKMMSKQAALIjGNEDTAVEEPVP 
EWPVQVETAKKSKKPSREVISCMFEPEGMACSLTDSTAfiEHVL 
ALVEHAADEARDRINRFLPGGKMGYLKRNGDGSliLYSWNTAEP 
DADEEETHPVDLSSLSSKLLPGFTTIjGFKDERRNKVTFLSSATT 
ALSMQNNSVFGDLKSDEMELLYSAYGDBTGVQCALSLQEFVIQA 
GS YS KKWDDLLDQI TGGDHS RTLFQLKQRRNVPMKPPDEAKVG 

HLNLDETTKLLQDLHEAQAERGGSRPSSNLSSLSNASERDOHHL 
GS PS RLS VGEQ PDVTHDP YEFLQS PE PAASAKT 


5553 


74 


1095 


LGREAVYLVS RMDGP VAEHAKQE PFHWTPltLES WAliS QVAGM P 
VFIiKCENVQPSGSFKI RG IGHFCQEMAKKGCRHLVCSSGGNAG I 
AAAYAARKLG IPATI VLPES TShQWQRLQG EGAE VQLTGKVWD 
EANLRAQEIiAKRDGWENVPP FDH PLI WKGHASLVQELKAVLRTP 
PGALVLAVGGGGLLAGWAGLLEVGW0HVPIIAME1WGAHCFNA 
A I TAGKLVTL PDI TS VAKS IiGAKT VAARAIiE CMQVCKI HS E WE 
DTEAVSAVQQLLDDERMLVEPACGAAIAAIYSGLLRRLQAEGCIi 
PPSLTSVWIVCGGNNINSRELQALKTHLGQV 


5554 


156 


2318 


CSGRTGGRGSLRPAEWVCIiTCKLSGAETRGLLCPALRTWlMKVL 
GRS FFWVLFPVTjPWAVQAVEHEEVAQRVI klhrgrgvaamqsrq 
WVRDS CRKIiSGLLRQKNAVLWKLKTAIGAVEKDVGLSDEEKLFQ 
VHTFE 1 FQKEUSIESENSVFQAVYGLQRALQGDYKDWNMKESSR 
QRLEALREAAI KEETEYWELLAAEKHQVEAX»KNMQHQNQSLSML 
DEILEDVRKAADRLEEBIEEHAFDDNKSVKGVNFEAVLRVEEEB 
ANSKQNITKREVEDDLGLSMIiIDSQWNQYILTKPRDSTIPRADH 
HF 1 KD I VT IGMliS L PCGWL CTAIGLPTMFGY I ICGVLLGPSGIiN 
SIKSIVQVETLGEFGVFFTLFLVGIiEFSPEKLRKVWiaSLQGPC 
YMTLLMIAFGLLWGHLLRIKPTQSVFISTCLSLSSTPLVSRFLM 
GSARGrjKEGDIDYSTVLLGMLVTQDVQIiGLFKAVMPTLIQAGAS 
ASSSIWEVLRILVLIGQILFSIiAAVFIiLCLVIKKYLIGPYYRK 
LHMESKGNKEILILGISAFIFLMLTVTELLDVSMELGCFLAGAL 
VSSCGPWTEEIATSIEPIRDFLAIVFFASIGLHVFPTFVAYED 
TVLVFLTLSVVVMKFLLAALVLStilLPRSSQYIKWIVSAGIiAQV 
S3FSFVLGSRARRAGVlSREVYLLILSVTTLSLLLAPVLWRAAt 
TRCVPRPERRSSli 


5555 


212 


1425 


LSLRTRETPAPPRCEAASQGRVGWRADAAAEEAVRSVWNRTRDR 
G TMAPCNI*S TFCLLLL YL IGAVI AGRD FYKILG VPRSAS IKD I K 
KAYRKLALQLHPDRNPDDPQAQEKFQDLGAAYEVLSDSEKRKQY 
DTlfGEEGLKDGHQSSHGDI FSHFFGDFGFMFGGTPRQQDRNI PR 
GSDI IVDLEVTLEEVYAGNFVEWRWKPVARQAPGKRKCNCRQE 
MRTTQLGPGR FQMTQE VGCDECPWVKL VNEER TLE VEI EPG VRD 
GMEY P F IGEGEPHVDGEPGDLRFRIKWKHP I FERRGDDIjYTNV 
TXSLVESL VGFEMDI THLDGHKVHISBDKXTRPGAKLMKGEGL 
PN FDNNNI KGSL 1 1 T FDVDF P KEQLTEEAREG IKQLLKQGS VQK 
VYNGLQGY 


5556 


"■ 5835 " " 


3346 


RTRGMS kkcvpmefee yllrmfqgtfyllqki tkdnnahtvksr 
LEELDES YIEKFTDFLRLFVS VHLRRIE S YSQ FPWE FtTtXFK 
YTFHQPTHEGYFSCLDIWTLFLDYLTSKIKSRLGDKEAVLtJRYE 
DALVLLLTEVLNRIQPRYNQAQLEELDDETLBDDQQTEWQRVLR 
QSLEWAKVMELLPTHAFSTLFPVLQDNLEVYLGIiQQFIVTSGS 
GHRI^ITAENDCRRLHCSIOTLSSLLQAVGRIAEYFIGDVFAAR 
FNDALTVVERLVKVTLYGSQIK1.YNIBTAVPSVLKPDLIDVHAQ 
SIAALQAYSHWLAQYCSEVHRQNTQQFVTLI STTMDAI TPLIST 

LRLVDKAQVLVCRALSNI LLLPWPNLPENEQQWPVRS INHASL I 
SALSRDYRNLKPSAVAPQRKMPLDDTKL I IHQTLS VLEDIVENI 
SGESTKSRQICTQSLQESVQVSLALFPAFIHQSDVTDEMLSFFL 
TLFRGIiRVQMGVPFTEQI IQTFLNMFTREQLAES ILHEG5TGCR 
WEKFLKILQWVQEPGQVFKPFLPSI IALCMEQVYPI I AERFS 
PDVKAELFEXiLFRTLHHNWRYFFKSTVLASVQRGIAEEQMENEP 
QFS AI MQAFGQSFLQPD I HLFKQNLFYLETLNTKQKLYHKKIFR 
TA^FQFVNVLLOJVIjVHKSHDLLQEEIGIAIYNMASVDFDGFFA 
AFLP2FLTS CDGVDANQKSVLGRNFKMDRVRRERGRAKRRAEWA 

RKPGTPJUVR.RGHIE ASGCGL CPPC 


" 5S57 


1712 


491 


VI LGAGLRDKDMWI P WGLPRRLRLSALAGAGRFCI LGSEAATR 
KHIiPARNHCGLSnsSPQLWPEPDFRNPPRKASKASLDFKRYVTD 
RRLAETLAQIYLGKPSRPPHLLLECNPGPGILTQALLEAGAKVV 
ALESDKTFIPHLESLGKNLDGKLRVIHCDFFKLDPRSGGVIKPP 
AMS S RGL FKNLGIEAVP WTAD I PLXWGMFPSRGEKRAL WKLAY 
DLYSCTSIYKFGRIEVWMFIGEKEFQKLMADPGNPDLYHVLSVI 
WQLACEIKVIiHMEPWSSFDIYTRKGPLENPKRRELLDQLQQKIiY 
LIQMIPRQNLFTKNIiTPMNYNIFFHLLKHCFGRRSATVIDHLRS 
XiTPLDARDILMQIGKQED3KWNMHPQDFKTLFETIERSKDCAY 
KWLYDETLEDR 


5558 


1509 


96 


TGWSMRLWTPVGVLTSLAYCLHQRRVALAELQEADGQCPVDRS 
LLKLKMVQWFRHGARSPLKPLPLEEOVEWNPQLLEVPPQTQFD 
YTVTNLAGGPKPYSPYDSQYHETTLKGGMFAGQLTKVGMQQMFA 
LGERLRKNYVE DIP FLS PTFNPQE VF I RSTN I FRNLE STRCLLA 
GLFQCQKEGP 1 1 IHTDEADSEVLYPNYQSCWSLRQRTRGRRQTA 
SLQPGISEDLKKVKDRMGIDSSDKVDFFILliDNfVAAEQAHNLPS 
CPMLKRFARM I BQRAVDT S LY I LP KED RES LQMAVGP F LH I LES 
NLLKAMDSATAPDKIRKIiYLYAAHDVTFIPLLMTLGIFDHKWPP 
FAVDLTMEL YQHLES KE WFVQL YYHGKEQ VPRGCPEX3LCPLDMF 
LNAMSVYTLS PEKYHALCS QTQVMBVGNEE 


5559 


150 


1983 


PLAATAHFAKM SRVAKYRRQVS ED PD I DSLLETLS PEEME ELEK 
ELDVVDPDGSVPVGLRQRNQTEKQSTGVYNREAML2VFCEKETKK 
LMQREMSMDESKQVETKTDAKNGEERGRDASKKALGPRRDSDLG 
KEP KRGGT.K1CS FS RDRDE AGG KSGT3KP PCEE KT I RG I DKGR VP AA 
VDKKEAGKDGRGEERAVATKKE EE KKGSDRNTGLS RDKDKKREE 
MKEVAKKEDDEKVKGERRNTDTRKEGEKMKRAGGNTDMKKEDEK 
VKRGTGMTDTKKDDEKVKKNEPLHEKEAKDDSKTKrPEKQTPSG 
PTKPSEGPAKVEEEAAPSIFDEPLERVKNNDPEMT3VNVNNSDC 
ITNEILVRFTEALEFNTVVKLFALANTRADDHVAFAIAIMLKAN 
KTITS LN LDSNHI TGKG I LAI FRAL LQNNTL TE L R F HNQRH I CG 
GKTEME IAKLLKEaJTTLLKLGYHFEI^GPPJ4TVTNLLSRNMDKQ 
RQKRLQEQRQAQEAKGE KKDLLE V P KAGAVAKGS PXPSPQ PS P K 
PSPKNS PKKGGAPAAPPPPP PPLAP PLI MENLKNSLSPATQRKM 
GDKVL PAQEKNSRDQLLAAI RSSKLKQ LKKVE VPKLLQ 


5560 


9 


921 


SSWEFSALSVSMACLSPSQLQKFQQDGFLVLEGFIiSAEECVAM 
QQRIGEIVAEMDVPLHCRTEFSTQEEEQLRAQGSTDYFLSSGDK 
IRFFFEKGVFDEKGNFLVPPEKSINKIGHALHAHDPVFKSITHS 
FKVQTLARSIjGLQMPVWQSMYIFKQPHFGGEVSPHQDASFLYT 
E PLGRVLGVW I AVEDATLENG CLWF I PGSHTSG VSRRMVRAP VG 
SAPGTSFLGSEPARDNSLFVPTPVQRGALVLIHGEVVHKSKQNL 
SDRSRQAYTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


5561 


2175 


1775 


CYFIFQFFSSPYPGLHPHQTPAPLPNPGLYPPPVSMSPGQPPPQ 
QLIiAPTY F S APGVKNFGNPS Y P YAPGAXjP P P P P PHL Y PNTQAP S 
Q V YGG VT Y YNPAQQQVQ P KPS PP RRT PQP VTI KP PPPE WSRGS 
S 


5562 


342 


1385 


SSGKMDMAAAGAAGIj VRGLKAG VLSQ AD YIiNLVQCBTIjED I»KLH 
LQS 7D YGN FLANE AS PLTVS V IDDRLKEKMWE FRHMRNHAYE P 
LASFLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE 
QMEAVNIAQTPAELYWAILVDTPLAAFFODCISEQDLDEMNIEI 
IRNTLYKAYLESFYKFCTXiLiGGTTADAMCPILE FEADRRAFI IT 
rNSFGTELSKEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA 
D YYP E YKLL FEGAGSNPGD KTLEDR F FSHE VKLNKliAFLNQFHF 
GVFYAFVKLKEQECRNIVWIAECIAGRHRAKIDNYIPIF 


5563 


342 


1385 


SSGKlWMAAAGAAGLWGLKAGVLSQADYLtiLVQCETLBDLiaM 
LQSTDYGNFIAMEASPLTVSVIDDRLKEKMWEFRHMRNHAYEP 
IiAS FLDFITYS YMXDNVILLI TGTLHQRS IABLVP KQi PLGS FE 
QMEAVN I AQTPAE LYNAI LVDTPLAAF FQDCI SEQDLDEMN I EI 
IRNTL YKAYLES F YKFCTLLGGTTADAMCPI LFPF ariPT?a.PT tt 
INS FGTELSKEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF 


5564 


3 


914 


RVRRDKRAVWTARGRRRCGDSMSGGWMAQVGAWRTGALGIiALLL 
LLGLGLGLEAAAS PLSTPTSAQAAGPSSGSCP PTKFQCRTSGIaC 
VPLTWRCDRDLDCSDGSDEEECRIEPCTQKGQCPPPPGLPCPCT 
vj v ou v-o vsur x u ajnjjkw LoKIAC Ltfv3 BIjKCTIjSDDCIPIjTWRCDGH 
PDCPDSSDELGCGTNEI LPEGDATTMGPPVTLES VTS LRNATTM 
GPPVTLESVPSVGNATSSSAGDQSGSPTAYGVIAAAAVLSASLV 
TATLLLLSWLRAQERLRPLGLLVAMKESLLLSEQKTSLP 


5565 


993 


138 


RWNSPNPARAGS ISRPQRAPGS VSAVAMTAAVFFGCAFIAFGPA 
LALYVFTIATEPUillPLIAGAFFWIiVSLrjISSLVWFMARVIID 
KKDGPTQKYLL I FGAFVSVY IQEMFRFAY YKLLKKASEGLKS IN 
PGETAPSMRljl^YVSOr.fSFf^TMQrtT/ir^ru'vrpr.criOT' r-ar^rnrr/i-ru 
GDSPQ FFXtYSAFMTLVI I LLHVFWG IV FFDGCE KKKWG I LL IVL 
LTHLLVSAQTFISSYYG INLASAFI ILVLMGTWAFLAAGGSCRS 
IiKLCLLCQDKNFliLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAPVKMVSWM I S RAVVIiVFGMLYPAY YS YKAVKT 

KNVKEYVRWMMYWIVFALYTVrSTVADQTVAWFPLYYELKIAFV 

IWLLSPYTKGASLI YRKFLH PLLSSKEREI DDYI VQAKERGYET 

MVNFGRQGLNLAATAAVTAAVKSQGAlTERJjRSFSMHDLTTIQG 

DEPVGQR P YQPLPEAKKKS KPAPSES AG YGI PLKDGDEKTDEEA 

EGPYSDNEMLTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKK 

RPQVYF 


5567 


1554 


233 


E FLGSG VS PDIANEDGLTALHQCCIDDFREMVOQIiLEAGANI NA 
CDSECWTPLHAAATCGHIiHLVELLIASGANLLAVNTDGNMPYDL 
CDDEQTLDCLETAMADRGITQDS IEAARAVPELRMLDDIRSRI.Q 
AGADLHAPLDHGATLLHVAAANGFSEAAALLLEHRASLSAKDQD 
GWEPLHAAAYWGQVPLVELLVAHGADLNAKSLMDETPLDVCGDE 
EVRAKLIiELKHKHDALLRAQSRQRSLLRRRTSSAGSRGKVVRRV 
SLTQRTDL YR KQHAQEAI VWQQP P PT£ PE P PEDNDDRQTGAELR 
PP PPEEDNPEWRPHNGRVGGS PVRHLYS KRIiDRSVS YQLSPLD 
STTPHTLVHDKAHHTLADLKRQRAAAKLQRPPPEGPESPETABP 
GL PGDTVTPQPDCGFRAGGDP PLIiKLTAPAVEAPVERRP CCLLM 


5568 


1731 


587 


AEBRQPAS RRGAGT TAAMAASGPGCRS WCLCPE VPS ATFFTALL 
SLLVSGPRLFLLQQPLAPSGLTIiKS EALRNWQ VYRLVTY I FVYE 
NP IS LLCGAI 1 1 WR FAGNFERTVGTVRHCPFTVI FA1 FS AIIFL 
SFEAVSSLSKLGEVEDARGFTPVAFAMLGVTTVRSRMRRALVFG 
MWFSVTjVPWLLLGASWLIPQTSFIiSNVCGLSIGLAYGLTYCYS 
IDLSERVALKLDQTFPFSLMRRISVFKYVSGSSAERRAAQSRKL 
NPVPGSYPTQSCHPHI>SPSHPVSOTQHASGQ!CLASWPSCTPGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIQPPTP 
VNS PGTVYSGALGTPGAAGSKESSR VPMP 


5569 


2 


835 


3TPCPLAWERGSRSEDISVPGQKPPTCSSFSGMDVGPSSLPHLG 
LKLLLliLLLL P LRGQANTGC YGI PGM P GLPGAPGKDG YDGlj PG P 
KJbc.i' (j 1 j?AXi?\a X Kva e ts&Q KGfc. PGIj r r GKNG PMC^PPGMPGVPG 

PMGIFGEPGEEGRYKQKFQSVFTVTRQTHQPPAPNSLIRFNAVL 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVIiLYRSGVK 
WTFCGHTS KTNQVNSGG VLLRLQVG EEVWLAVND Y YDM VG I QG 
SDSVPSGFLIiFPD 


5570 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDWKLlESKHEVTIIiGGLNEFWKFYGPQGT 
PYEGGVWKVRVDIiPDKYPFKSPSIGFMNKIFHPNtDEASGTVCl. 
DVINQTl^TALYDLTNXFESFLPQLLAYPNPIDPLNGDAAAMYLH 
RPEBYKQKIKEYIQKYATEBALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RDRRDRGG VATSTEEPAR PRAPQSRG PGPVS QTGRGRERGGGDT 
MS S PSPGKRRMDTDWKLI ES KHE VT I LGGLNE F WKF YGPQGT 
P YEGGVW EtVRVDLPDKY PFKS PS IGFMNKI FHPNIDEASGTVCL 
D V IWQTWTAI* YDLTNI FESFLPQLLA Y PNP IDPLNGDAAAM YLH 
RPEEYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTDYRTGIPGRRFRVMAAGDGDVKLGTLGSGSESSNDGGSESPG 
DAGAAAEGGGWAAAAIALLTGGGEMLLKVALVALVI.U5AYRLWV 
RWGRRGLGAGAGAGEESPATSZiPRMKKRDFSZ(E<}IiRQYDGSRNP 
RILLAVNGKVFDVTKGS KF YG PAGP YG I FAGRDASRGIiATFCLD 
XDALRDE YDDLS DLNAVQME S VREW EMQFKEKYD YVGRLLKPGE 
EPSEYTDEEDTKDHNKQD 


5573 


2562 


219 


VPART PNAEDQGP EARAAT AT PCQSGGRERAGEAAEDG VKMAAF 
SEMGVMPE IAQAVEEMDWLLPTD1QAES IPLILGGGDVLMAAET 
GSGKTGAFS IPVIQ I VYETI*KDQQEGKXGKTTI KTGASVLNKWQ 
MNPYDRGSAFAIGSDGLCCQSREVKEWHGCRATKGLMKGKHYYE 
VS CHDQGLCRVG WS TMQAS LDLGTDK FGFGFGG TGKKSHNKQFD 
NYGEEFTMHDTIGCYLDIDKGHVKFSKNGKDLGLAFEIPPHMKN 
QALFPACVLKNAELKFNFGEEEFKFPPKDGFVALSKAPDGYIVK 
SQHSGNAQVTQTKFLPNAPKALIVEPSRELAEQTLNNIKQFKKY 
IDNPKLRELL I IGG VAARDQLS VLENGVD I WGTPGRLDDLVST 
GKLNLSQVRFLVLDEADGLLSQGYSDFINRMHNQIPQVTSDGKR 
LQVI VCS ATLHS FDVKKLS EKI MHF PTWVDLKGEDSVPDTVHHV 
WPVNPKTDRLWERIiGKSHIRTDDVHAKDNTRPGANSPEMWSEA 
IKrLKGEYAVRAIKBHKMDQAIIFCRTKIDCDNLEQYFIQQGGG 
PDKKGHQFSCVCLHGDRKPHERKQNLERFKKGDVRFLICTDVAA 
RGID1HGVPYVINVTLPDEKQNYVHRIGRVGRAERMGLAISLVA 
TEKEKVX^YHVCSSRGKGCYWTRLKEDGGCTIWYNEMQLLSEIEE 

APTVQELAALEKEAQTSFLHLGYLPNQLFRTF 


5574 


1731 


952 


NEGIiEVFKEQELQPSDKGAVPEDASTERSAMASLGLQIiVGYlLG 
LU3LLGTLVAMLLPSWKTS S YVGAS I VTAVGFS KGLWMECATHS 
TGITQCDI YS TLLGLPADIQAAQM4MVTSSA1SSLAC X IS WGM 
RCTVFCQE SRAKDRVAVAGGVF F I LGGLLGF IP VAWNLHG I I*RD 
FYSPLVPDSMKFEIGEALYLGI ISSLFSLIAGX ILCFSCSCQRN 
RSNYYDAYQAQPLATRSS PRPGQPPKVKSEFNS YSLTGYV 


5575 


456 


766 


LLWALPCPPPTAAAAnjLSSTGLMEIiLEKMLALTLAKADSPRTAL 
SPDIGRKSPHYLMFP 


5576 


249 


2146 


RS WGAP W FWRMRLLRRRHMP LRLAM VGCAF VLFLFL LHRDVS S R 
EEATEKP WLKSLVSRKDHVLDLMLEAMNNLRDS MPKLQI RAPE A 
QQTLFSINQSCLPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 
SKWTPLETQEKEEG YKKHCFNAFASDRI SLQRS LGPDTRPPECV 
DQKFRRCP PLATTS VI IVFHNEAWS TLLRTVYS VLHTTPAILLK 
EIILVDDASTEEHLKEKLEQYVKQLQWRWRQEERKGLITARIi 
LGASVAQAE VLTFLDAHCE CFHGWLE PLLARI AEDKTVVVS PD I 
VTIDLWTFEFAKPVQRGRVHSRGNFDWSLTFGWETLPPHEKQRR 
KDETYP IKS PTPAGGL FS 1 SKS YFEH I GT YDNQME I WGGENVEM
SFRVWQCGGQLE 1 1 PCS WGHVFRTKS PHTFPKGTSVI ARNQVR 
liAEVWMDSYKKIFyRRNLQAAKMAQEKSFGDISERLQLREQLHC 
HMFSWYLHNVYPEMFVPDLTPTFYGAIKNLGTNOCLDVGBNKRG 
GKPli I M YS CHGLGGNQYFE YTTQRDLRHN I AKQLCLHVSKGALG 
LGSCHFTGKNSCVPKDEEWELAQDQLIRNSGSGTCLTSQDKKPA 
MAPCNPS DPHQLWLFV 


5577 


3 


1275 


RNSDCS CGEI SVHCLP WVLFI LDLKVES SMFCPMCLI LLPVLLD 
YS LGLNDLNVSP P ELTVHVGDSALMGC VFQSTEDKCI FKIDWTL 
S PGEHAKDEY VLYYYSNLSVPIGRFQNRVHLMGDI LCNDGS LLL 
QP VQEADQGT YI CS 1 RLKGESQVFKKAVVh W/L PEEPKELMVHV 
GGLIQMGCVFQSTE VKHVTKVEW I FSGRRAKEEI VFRYYHKLRM 

IHLGNLVFKKTIVLHVSPBEPRTLVTPAALRPLVIiGGNQLVI IV 
GIVOTILLLPVLIIilVKKTCGNKSSVNSTVLVKNTKKTNPBIK 
EKPCHFERCEGEKHIYSPIIVREVIEEEEPSEKSEATYMTMKPV 
WPSLRSDRNNSLEKKSGGGMPKTQQAF 


5578 


3 


783 


AVESMAS PGAGRAP PELPERNCGYREVE YWDQRYQGAADSAP YD 
WFGDFSSFRAI,LEPELRPEDRILVLGCGNSAIiSYEI,FLGGFPNV 
TS VDYSS VWAAMQARYAHVPQLR WETMDVRKLDFP SAS FDWL 
EKGTLDALLAGERDPWrVSSEGVHTVDQVLSEVSRVL.VPGGRFI 
SMTS AAPHFRT RHYAQAYYGWSLRHAT YGSG FHFHLYLMHKGGK 
LS VAQ I ALGAQILS P PRPPTS PCFLQDSDHEDFLSAIQL 


5579 


3 


1540 


RNSGIARGASAI^HGGGUAGGVGWDCGACASRCQGVMEGLLTR 
CRALPALATCSRQL SG Y VPCRFHHCAPRRGRRHiLS RVFQPQNI* 
REDRVXjSLODKSDDLTCKSQRLMLQVGLIYPASPGCYHLLPYTV 
RAME KLVRVI DQEMQAI GGQKVNMPS I*SPAEIjWQATNRWDIjMGK 
E LLRLRDRHGKS YCIiG PTHEEAI TAXi IAS QKKIiS YKQL P FIX YQ 
VTRKFRDEPRPRFGLLRGRKFYMKDMYTFDSSPEAAQQTYSIjVC 
DAYCSLFNKLGI.PFVKVQADVGTIGGTVSHEFQLPVDIGEDRIA 
X C PRCSFS ANMETLDLSQMNCPACQG PLTKTKGI EVGHTF YLGT 
KYSSI FNAQFTNVCGKPTLAEKGCYGLG VTRZ LAAAI EVLS TED 
CVRWPSLLAPYQACLIPPKKGSKEQAASELIGQLYDHITEAVPQ 

LIIGEVLLDDRTHXjTIGNRbKDANKFGYPFVIIAGKRALEDPAHF 

TTT/w^rtMT'rtmriL i7T .TvnnVMriT t r r , 'oiTr\T*x7 
iivwLyn loJBV/ir Jul JMJkjVriui^L)i jpvqt, v 


5580 


1681 


450 


ACAGTRCIPGFWPSGAGYSAPAQRGRRSSGRMRAAAAPGLTAP 
WRLLQCCELEAGEIiGMAVPAAAMGPSALGQSGPGSMAPWCSVSS 
GPSRYVLGMQELFRGHSKTREFIoAHSAKVHSVAWSCIX?RRLASG 
S FDKTAS VFLLE KDRLVKENN YRGHGDSVDQLCWHPSNPDLFVT 
ASGDKT I R I WD VRTTKCIAT VNTKGENIN I CWS PDGQTIAVGNK 
DDWTFIDAKTHRSKAEEQFKFEVNEISWNNDNNMFFLTHGNGC 
INI hS YPELKP VQS INAHP SNCI C I KFDPMGKYFATGS ADAIiVS 
LWDVDELVC VRC FSRLDWPVRTLS FSHDGKMLASASEDH F IDIA 
EVETGDKLW E VQCES PT FTVAWH PKRPIaLAFACDDKDGKYDS SR 
EAGTVKLPGLP2JDS 


5581 


54 


947 


GGGSG PRAP SATHjDTGESVAAVASGE DKG I AASAAAAAVFACS 
CSPDPQSSTMNPVY5PVQPGAPYGNPKNMAYTGYPTAYPAAAPA 
YNPSL YPTNS PS YAPEFQFIjHSA YATLLMKQAW PQNSSS CGT3G 
TFHLPVDTGTENRTYQASSAAFRYTAGTPYKVPPTQSNTAPPPY 
SPSPWP YQTAM YP IRS A YPQQtflj YAQGAYYTQPVYAAQPH VIHH 
TTWQPNSIPSAIYPAPVAAPRTNGVAMGMVAGTTMAMSAGTLL 
TTPQHTAIGAHP VSMPTYRAQGTPAYS YVPPHW 


5582 


5775 


2739 


I ITNNNNVI I PLVI AYHLSGSAQARGERS PAERXMERQKRKADI 
EKGLQFI QSTL PLKQEE YEAFLIjKIjVQKLFAEGNDL FRE KD YKQ 
ALVQ YMEGIitJVADYAASDQ VAL PRELLCKLHVNRAAC YFTMGL Y 
EKALEDS EKALGLDSES IRALFRKARALNELGRHKEAYECSSRC 
SLALPHDESVTQLGQELAQKLGLUVRKAYKRPQELETFSLIiSNG 
TAAGVADQGTSNGLGSIDDIETDCYVDPRGSPALLPSTPTMPLF 
PHVLDLLAPLDSSRTLPSTDSLDDFSDGDVFGPELDTLLDSLSL 
VQGGLSGSGVPSELPQIiIPVFPGGTPLLPPWGGS I P VSSPLPP 
ASFGLVMDPSKKLAASVU3ALDPPGPTLDPLDLLPYSETRLDAL
DS FGSTRGSLDKPDS FMS BTNSQDHRPPSGAQKPAPSPfePCMPN 
TALLIKNPLAATHEFKQACQLCYPKTGPRAGDYTYREGLEHKCK 
RDI LLGRLRSSEE5QTWKR IRPRPTKTS FVGS YYLCKDMINKQDC 
KYGDNCTFAYHQEE IDVWTEERKGTLNRDLLFDPLGGVKRGS LT 
IAKLLKEHQGIFTFIiCEICFOSKPRIISKGTKDSPSVCSNLAAK 
HSFyNNKCLVHIVRSTSIiKYSKXRQFQBHFQFDVCRHEVRyGCL 
REDSCMFAHSFIELKVWLLQQYSGMTHEDIVQESKKYWQQMEAH 
AGKASSSMGAPRTHGPSTFDLQMKFVCGQCWRNGCVVEPDKDLK 
YCSAKARHCWTKERRVUUVMSKAKRKWVSVRPLPSIRNFPQQYD 
LCIHAQNGRKCQYVGNCSFAHSPEERDMWTFMKENKILPMQQTY 
DMWLKKHNPGKPGEGTPI S SREGEKQIQMPTDYAD I MMGYHCWL 
CGKNSNSKKQWQQHIQSEKHJCEKVFTSDSDASGWAFRFPMGEFR 
LCDRLQKGKACPDGDKCRCAHGQEEL^WLDRREVLKQKIAKAR 
KD MLLCPRDDDFG K YNFLLQEDGDLAGATPEAP AAAATATTGE 


5583 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKKAYRKIiALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQ AIKEGG SGS PS FS S PMD I FDM FFGGGGRMAR ERRGKNW 
HQLSVTLEDLYNGVTKKLALQKNVIC3KCEGVGGKKGSVEKCPL 
CKGRGMH1HIQQIGBGMVQQIQTVCXECKGQGER1NPKDRCESC 
SGAKVI REKKI I EVHVEKGMKDGQKILFHGEGDQEPELEPGDVI 
IVLDQKDHSVFQRRGHDL IKKMKIQLSEALCGFKECTI KTLDNRI 
LV I TS KAGEV I KHGDLRC VRDEGM P I YKAPLE KGI L I IQFLVI F 
PEKH^SLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEON 


5584




1265 


SSGCRQGRPGRSDRPRP PPRRHKMVKETRYYD ILGVKPSASPEE 
I KKAYRKLALKYHPDKNPDEGEKFKL I S QAYE VLS DP KKRDVYD 
QGGEQAIKEGGSGSPSFSSPMDIFDMFFGGGGRMARERRGKNW 
HQhSVTLEDLYmvTKKhAJUQSWVlCEKCBGVGGKKGSVEKCPIi 
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVIREKKIIEVHVEKGMXDGQKILFHGEGDQEPELEPGDVI 
IVLDQKDHSVFQRRGHDL I MKMKI QLS E ALCGFKKTIKTLDNRt 
LVI TSKAGEVIKHGDLRCVRDEGMPI YKAPLE KG TCjUQFLVIF 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVEIiKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5585 


2619 


915 


LPAGTPESSLHEALDQCMTALDIiFLTNQFSEALSYLKPRTKESM 
YHS LT YATI LEMQAMMTFDPQDI LLAGNMMKEAQ'M LCQRHRR KS 
5 VTDSFSSLVKR P TLGQFTEEE IHAE VCYAKCLLQRAALTFLQD 
ENMVSF I KGGIKVRNS YQT YKELDSLVQSSQYCKGEtJHPHFEGG 
VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GHSFRSVLC^/MLLLCYHTFLTFSn^TGlTWIEEAEXLLKPYLNR 
YPKGAI FLFLAGRIE VI KGHIDAAIRRFEECCEAQQHWKQFHHM 
CYWEIiMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYI* 
SMFGKE DHKPFGDDE VELFRAVPGLKLKI AGKS L PTE KFA I RKS 
RRYFSSNPISLPVPALEMMYI WNGYAVIGKQPKLTDGILE I ITX 
AEEKLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYDHYLIPNALLELALLLMEQDRNEEAIKLLBSAKQ 
NYKNYSMESRTHFRIQAATLOAKSSLBNSSRSMVSSVSIi 


5586 


2619 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHSLTYATILEWQAMMTFDPQDILIAGKMMKEKQMLCQRJHRRKS 
SVTDSFSSLVNRPTLGQFTEEEIHAEVCYAKCLLQRAAIjTFLQD 
ENMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG 
VICLGVGAFNLTLSMLPTRIIiRLLEFVGFSGNKDYGLLQIiEEGAS 
GHSFRSVLCVMLLLCYHTFLTFVLGTGNVN IEEAEKLLKPYLNR 
YPKGAIFLFLAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDBVELFRAVPGLK1KIAGKSLPTEKFAIRKS 
RRYFSSNP I SLPVPALEMMYI WNGYAVIGKQPKLTDGILE I ITK 
AE EMLEKG P ENEYS VDDECLVKLLKGLCLKYLGRVQE AEENFRS 
ISANEKKI KYDHYL IPWALLELAr.LLMEQDRHEEAIKLLESAKO 
NYKNYSMESRTHFR IQAATLQAKSS LENSSRSMVSSVSL 


5587 


1768


148


SSAVPPGAVGRPVAVAVGGPPHSCRCRPCCIiMAAIGVHLGCTSA
CVAVYKDGRAG WANDAGDRVTPAWAYSENEE I VGLAAKQS RI 
RNISNTVMKVKQILGRSS SD PQAQKYI AESKCLVIEKNGKLR YE 
IDTTGEETKFVKPEDVARLI FS KMKETAHSVLGSDANDWIT VPF 
DFGEKQKNALGEAARAAG FNVLRL I HEPSAAL1AVG IGQDS PTG 
KSNILVFKLGGTSLSLSVMEVNSGIYRVLSTNTDDNIGGAHFTE 
rrAQYLASBFQRSFKHDVRGNARAMMKIiTNSAEVAKHSLSTLGS 
ANCFLDSUYEGQDFDCNVSRARFEXjLCS PLFNKCI EAI RGIiLDQ 
NGFTADD INKWLCGGSSRI PKLQQLI KDLFPAVELLNS I P PDE 
VIPIGAAIEAGILIGKENLLVEDSLMIECSARDILVKGVDESGA 
SRFTVLFPSGTPLPARRQHTLQAPGSISSVCLELYSSDGKNSAK 
EETKFAQWLQDLDKKENGLRDILAVLTMKRDGSLHVTCTDQET 
GKCBAISIEIAS 


5588 


3 


589 


TPPPPEQAMVAATVAAAWtiLLWAAACAQQEQDFYDFKAVNIRGK 
LVSLEKYRGSVSLWNVASECGFTDQHYRALQQLQRDLGPHHFN 
VLAFPCNQFGQQEPDSNKEIESFARRTYSVSFPMFSKIAVTGTG 
AHPAFKYLAQT SGKEPTWNFW KYLVAPDGKWGAWD PTVS VEEV 
RPQITALVRKLILLKREDI* 


5589 


1884 


553 


LRQAWHEGG IGQTDKERGAAAL PGE EGD PTRGRSLGRASW ESGS 
P RRPRS P FS 5 FLPRP I CLS LEARPCS I EDRRNWS L IGRPGAP AS 
GLNRSSGLWLGPDRCRPRSRCSCRVMENPSPAAAliGKALCALLLa 
ATLGAAGQPLGGES ICS ARAPAKYS I TFTGKWSQTAFPKQYPLF 
RPPAQWSSLLGAAHSSDYSMWRKNQYV5WGLRDFAERGEAWALM 
KEI EAAGEALQS VHAVFSA P AVP SGTG QTS AELEVQRRHS LVS ? 
WRIVPSPDWFVGVDSLDLCDGDRWREQAAUDLYPYDAGTDSGF 

TFS S PNFATI PODT VTE ITS S S PSHPAMS F YYPH.IjICA'LiPP T At? V 

TLLRLRQS PRAF 1 P PAP VLP S RDNE I VDS AS V PETPLDCEVS LW 
SSWGLCGGHCGRLGTKS RTR YVR VQPANNGS FCPEL EEEAEC VP 
DNCV 


5590 


72 


896 


LCSSGALRLLPAMVAWRSAFLVCIiAFSLxATLVQRGSGDFDDFNLt 
EDAVKETSS VKQP WDHTTTTTTNRPGTTRAPAKP PGSGLDIADA 
LDDQDDGRRKPG1GGRERWNHVTTTTKRPVTTRAPANTLGNDFD 
IiADALDDRNDRDDGRRKP IAGGGGFSD KDLEDI VGGG E YKP DKG 
KGDGRYGSMDDPGSGMVAEPGTIAGVASALAMALIGAVSSYISY 
QQKKFCFS I QQGI^ADYVKGENLEAVVCEE PQVKYSTLHTQSAE 
PPPPPEPARI 


5591 


68 


1494 


AGS SRRAAAE RLLVS AGCRSIiAGRASG VLLLPAEL LPGEEEAMA 
LRVTRNSKINAENKAKINMAG AKRVPTAP AATSKPGLR PRTALG 
DIGNKVSEQLQAKMPMKKEAKPSATGKVIDKKLPKPLEKVPMLV 
PVPVSEPVPEPEPEPEPEPVKEEKLSPEP ILVDTASPS PMETSG 
CAPABEDLCQAS3DVI hAVNDVDAEDGXDPKLCSE YVKDIYAYL 
RQLEEEQAVRPKYIiIXSREVTGWMRAILIDWLVQVQMKFRLLQET 
MYMTVS I IDRFMQNNCVP KKMLQ LVG VTAMF IAS K YEEM YPPB I 
GDFAFVTDNTYTKHQIRQMEMKILRALNFGliGRPLPLHFLRRAS 
KIGEVDVEQHTtAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
IL0NGE WTPTLQH YLS YTEESLLPVMQHIAKN^VAMVKTQGI*TKHM 
TVKNKYATSKHAKISTLPQIJ^SAIiVQDLAKAVAKV 


5592 


242 


924 


YGES KDWNQKDLLS ALVLTTVNCLPTP I MAKSAEVKIAIFGRAG 
VGKS AIjVYR FliTKR F I WE YD PTIiES TYRHQATI DDE WSMEILD 
TAGQEDTIQREGHMRWGEGFVLVYDITDRGSFEEVLPLKNILDE 
IKKP KNVTL I IiVGNKADLDHS RQ VS T EEGE KIATEIACAFYECS 
ACTGEGNITE I FYELCRE VRRRRMVQGKTRRRS STTHVKQAINK 
MLTKISS 


5593 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKK 
SSGIVADIjSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 
DVDLNHYRlGKIEGFEVLKKVKTLCIiRQNIilKCIENLEELQSLR 
BLDL YDNQI KKIENLEALTELE ILDISFNLLRNIEGVDKLTRLK 
KLFLVNNKISKIENLSNLHQLQMLEI.GSNRIRAIENIDTLTNLE 
SLFLGKNKITiCLQNLDALT^TVLSMQSNRLTKIEGLQN^VNLR 
ELYLS HNG I EVI EGtiENNNKLTMLD I ASNR I KKIENI SHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLOKDPQYRRKV
MLALPSVRQ IDATFVRF


5594 


3 


1113 


HAS GGPJ\ANMAAERG AGQQQSQEMME VDRRVE S EESGDEEGKKH 
SSGI VADLS EQS LKDGEERGE EDPEEEHBLPVDMET INLDRDAE 
DVDLNHYRIGKI EGFEVLKKVKTLCLRQNLI KC IENLEELQSLR 
ELDLYDNQI KKI ENLE ALTELE I LD I SFKTjLRNIEGVDKLTRLX 
KLFLVNNKISKXSNIjSNLHQIjQMLELGSKRIRAISNIDTLTNLE 
SLFLGKNKI T KLQNLDALTNLT VLSMQSNRLT KI EGLQNLVNLR 
EL YLSHNG I E VI EGLENNN KLTMLD I ASNRI KKI 3N I SHLTELQ 
EFW^DNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQIPATFVRP 


5595


3 


1476 


ARWNGRWVQVPAWPGPGCGTNASGERQRQLPRAWRPVGRTLGSE 

LG I PT VFGfCVTIiQKDAQNtil GI S I GGGAQ YCPCL YI VQ VFDNTP 
AALDGTVAAGDE I TGVNGRS I KGKTKVE VAKM I QE VKGE VT I HY 
NKLQADPKQGMSLDIVLKKVKHRLVENMSSGTADALGLSRAILC 

NDGLVKRL»EEI>ERTAEiy^ 

VFSVIGVREPQPAASBAFVKFADAHRS IEKFG IRLLXTI KPMLT 
DLNTYLNKAI PDTRLTI KKYLDVKFEYLSYCLKVKEMDDEEYSC 
IAIX3EPLYRVSTGNYEYRLILRCRQEARARFSQMRKDVLEKM3L 
TJDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLRDADVFP I E VDLA 
HTTLAYGLNQEEFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSWCDS 


5596


£98 


219 


GAVLAPSSLPAAELAAQGES QSLBDLSNTSRPTSE VYKIS FI FP 

ngdkydgdctr'ts SG I YERNGIGIHTTPNG I vytgswkddkmng 

FGRLEHFSGAVYEGQFKDNMFHGIiGTYTFPNGAKYTGMFNENRV 
KGEGEYTHIQGTRMDWTFHFTSCSQT 


5597 


3 


731 


I SCKMAADGQSSLPASWRS VTLTHVEYPAGDLSGHLLAYLSLS P 
VFVI VGFVTLI I FKRELHTI SFIjGGLALNEGVNWL I KNVIQEPR 
PCGGPHTAVGTKYGMPS SHS Q FMWF FS VYS FLFLY LRMHQTNNA 
RFIJ5LLWRHVLSLGLLAVAFLVSYSRVYLLYHTWSQVLYGGIAG 
GLMAIAWFIFTQEVLTPLFPRIAAWPVSEFFLIRDTSLIPNVLW 
FEYTVTRAEARNRQRKLGTKIiQ 


5598


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL 
VPLLGSGVPPHPPAPS PCCSGQTMLKMLS FKLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLKGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGF YPRLS CCIiRSDS PGLGRLENKI FS VTNNTECGKLLEE 
I KCALCS PHSQS LFHS PEREVLERDLVLPLLCKDYCKEFFYTCR 

QMEEYDKV^EISRKHKHKCFCXQEWSGLRQPVGAIjHSGDGSQR 

lfilekegyvkiltpegeifkepyldihklvqsgikggdergll 
sij^pnykkkgklyvsyttnqerwaigphdhilrvveytvsrk 
nphqvdlrtarvflevaelhrkhlggqllfgpdgflyi i lgdgm 
itlddmeemdgi*sdfrgsvlrij3vdtdmc!nvpys i prsnphfus 

TNQPPEVFAHGI»HDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARILQI IKGKDYESE PSLLEF KPFSNG P LVGGF VYRGCQS ERL 
YGSYVFGDRNGNFIjTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
HILGFGEDEI>GEVYILSSSKSMTQTHNGKLYKIVDPKRPIjMPEE 
CRATVQPAQTLTSECSRLCRKGYCTPTGKCCCSPGWEGDFCRTG 


5599 


326 


2440 


gigpx aas fi fc kvas lyi flsppppsvsgvp ys pans swscal 

VP LLGS GV P P HP PAPS P CCS GQ TMLKM LS FKLLLLAVALGF PEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGF YPRLS CCLRSDSPGLGRtJENKIFSVTNNTECGKLLEE 
IKCALCSPHSCJSLFHSPEREVLERDLVLPLIiCKDYCKEFFYTCR 
GH I PG FLQTTADEFCF Y YARKDGGLCFPD F PRKQVRG PASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
S LAFHPNY KKNGKLYVS YTTNQERWAIGPHDHI LRWE YT VS RK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYI I LGDGM 
ITLDDMEEMDGL£DFTGSVJCJ?LDVDTDMCNVPYSIPRSNPHFNS 
TNQP PE VFAHGLHDPGRCAVDRHPTDXNINLT XLCSDSNGKNRS
S AR I LQ 1 1 KGKD YES E P S LLE F KPFSNG P L VGG F VYRG CQS ERL
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
HI LGFGEDELGE VYILSS S KSMTQTHNGKLYKI VDPKRPLMPEE 
CRATVQPAQTLTSECSRLiCRNGYCTPTGKCCCSPGWEGDFCRTG 


5600


1977 


1244 


S LRVLSGHLMQTRDLVQPD KPAS PKF I VTLDGVP SPPGYMSDQE 

SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 


5601 


1977 


1244 


S LRVLSGHLMQTR DLVQ P D KPAS PKF I VTLDG VPS P PG YM S DQE 
EDMCFEGMKP VNQTAASNKGLRGLLH PQQLHLLSRQLEDPNTGS F 
SNAE MSELS VAQK PEKLLERCKYWPACKNGDECAYHHP I S P CKA 
FPNCKFAE KCLFVHPNCKYDAKCTKPDCPFTHVSRRI P VLS PKP 
AVAf A* Air b'bbbUijCRYr PACKKMBCPF i HPKHCRrNTQCTRPDC 
TF YHPTINVPPRHALKW I RPQTS E 


5602


246 


766 


YHTSCTVWRTAKEALENTEVPVGCtjMVYNNEWGKGRNEVNQTK 
NATRHAEMVAIDQVtiDWCRQSGKSPSEVFEHTVLYVTVEPCIMC 
AAALRIiMKI PL WYGCQN ER FGG CGSVLNI ASADLPNTGRP FQC 
I PG YRAEEAVEMLKTFYKQENPNAP KS KVRKKE CQQ I LNMF 


5603


1 


565 


FRGRTP I SGGERGCAQYP I PATPARSGENRTM PGAGDGGKAPAR 
WLGTGLLGLFLLPVTLSLEVSVGKATDIYAVNGTEILLPCTFSS 
CFGFEDLHFRWT YNS SBAF K I L I EG TVKNE KSDP KVTLKDDDR I 
TLVGSTKEKRNNIS I VLRDLEFSDTGKYTCHVKNPKENNLQHHA 
T I FLQWDRRMQ 


5604 


1 


1506 


EDIFPAQLLKLQRHERVWQQE P P VRDHRS WGGSGAGG VAGRE WT 
DQGQVALGGHYMAEGEG Y FAMS EDELACS P YI P LGGDFGGGD FG 
GGDFGGGD FGGGDFGGGGS FGGHCLD YCES PTAHCNVLNWEQVQ 
RLDG ILSETIPI HGRGNF PTLE LQPS LI VKVVRRRLAEKRIGVR 
DVRLNGSAASHVLHQDSGLGYKDLDL I FCADLRGEGE FQTVKDV 
VLDCLLDFLPEGVNKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
SLSNNSGKNVELKFVDSLRRQFEFSVDSFQIKLDSLLLFYECSE 
NPMTETFHPrilGESVYGDFQEAFDHLCWKIIATRNPEEIRGGG 
LLKYCNLLVRGFR PASDE I KTLQRYMCSRFFI DFSDIGEQQRKL 
ESYLQNHFVGLEDRKYEYLMTLHGWNESTVCLMGHERRQTLNL 
ITMLAIRVLADQETVI PNVANVTC Y YQ PAP YVADANFSN Y YI AQV 
QPVFTCQQQTYSTWLPCK 


5605 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPALGL 

MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 

KALRSLRRYPLPLRSGKEAKILQHFGDGLCRMLDERLQRHRTSG 

GI)IlAPDSPbC»JENSPAPQGRIiAEVQuSSMPV 

ARHS GARVI LLVL YREHLNPNGHH FLTKE ELLQRCAQKS PR VAP 

GSARPWPALRSLLHRNLVLRTHQPARYSLTPEGLELAQKLAESE 

GLSLLNVGIGPKEPPGEETAVPGAASAELASEAGVQQQPLEIiRP 

GEYRVLLCVDIGETRGGGHRPELLRELQRLHVTHTVRKLHVGDF 

WJVAQETNPRDPANPGELVLDHI VERKRLDDLCSS I IDGRFREQ 

KFRLKRCGLERRVYLVEEHGSVHNLSLPE5TLLQAVTNTQVIDG 

FFVKRTADIKESAAYLALLTRGLQRLYQGHTLRSRPWGTPGNPE 

SGAMTSPNPLCSLLTFSDFNAGAIKNKAQSVREVFARQLMQVRG 

VSGE KAAALVDR YSTPAS LLAAYDACATP KEOETLLSTI KCGRL 

QRNLGPALSRTLSQLYCSYGPLT 


5606 


3 


1099 


GRSRCPGPGARGGTMSPRSCLRSLRLLVFAVFSAAASNWLYLAK 
LSSVGSISEEETCEKLKGLIQROVQMCKRNLEVMDSVRRGAQLA 
IEE CQ YQFRNRRWNCS TLDSbPVFGKVVTQGTREAAFVYA I S S A 
GVAFAVTRAC S SGELEKCGCDRT VHGVSPQG FQWSGCS DN I AYG 
VAFSQSFVDVRERSKGASSSRALMWLHNNEAGRKAILTHMRVEC 
KCHG VSGS CE VKT CWRAVP P FRQVGHALKEKFDGATE VE PRRVG 
SSRALVPRNAQFKPHTDEDLVYLEPSPDFCEQDMRSGVLGTRGR 
TCNKTSKAIDGCELLCCGRGFHTAQVELAERCSCKFHWCCFVKC 
RQCQRLVELHTCR 



5607 


521 


141 


PPVCNPAEAMPS PGT VCSLL1.LC3MI .WLDLAMAGSS FLSPEhQRV 
QQRKESKKP P A KLQPRALAGWLRP EDGG Q AEGAEDELE VRFNA P 
FDVG IKLS G VQ YQQHSQALGKFLQD I LW EEAKEAPAD K 


5608 


2 


983 


WFQSPLRQADPGPPRHTLFMDFVAGAIGGVCGDAVGYPLDTVKV 
R IQ TEPKYTG I WHCVRDT YHRBRVWGFYRGLLL PVCT VS LVS SE 
VFGTYRHCLAHICRLRFGNPDAKPTKADITIjSGCASGLVRVFLT 
SPTEVAKVRLQTQTQAQKQQRRLSASGPIiAVPPMCPVPPACPEP 

KYRGPLHCLATVAREEGLCGLYKGSSALVLRDGHSFATYFLSYA
VLCEWLSPAGHSRPDVPGVLVAGGCAGVLAWAVATPMDVIKSRL
QADGQGQRRYRGLLHCMVTIVREEG?RVLFKGLVLNCCRAFPVN
MWFVAYEAVLRLARGLLT


5609 


1628 


304 


AKGVWVXjP S PP P RP GRGAIt VSGSGLRRGRSGTS WRPR RMNHKS K 
KRIREAKRSARPELKDSLDWTRHNYYESFSLSPAAVADITVERAD 
ALQLSVEEFVERYERPYKPWLLNAQEGWSAQEKWTLERLKRKY 
RNQKFKCG EDNDG YS VKMKMKYY I E YMESTRDDS PLY I FDSS YG 
EHPKRRKLLEDYKVPKFFTDDLFQYAGEKRRPPYRMPVMGPPRS 
GTGI H I DPLGTS AWNAL VQGHKRW CL PPTST PREL IKVTRDEGG 
NQQDEAITWFNV1YPRTQLPTMPPEFKPLEILQKPGETVFVPGG 
WWHVVI^iDTTIAITQNFASSTNFPVVWHKTVRGRPKLSRKWYR 
ILKQEHPELAVIADSVDIjQESTGIASDSSSDSSSSSSSSSSDSD 
SECESGSEGDGTVHRRKKRRTCSMVGNGDTTSQDDCVSKERSSS 
R 


5610 


54 


1196 


LERTPA3 ADMAWTKYQIiFLAGLMLVTdd t NTLS AKWADN FMAEG 
CGGSKEHSFQHPFLQAVGMFLGEFSCLAAFYLLRCRAAGQSDSS 
VDPQQPFNPLLFLPPAItCDMTOTSIiMYVAIiKMTSASSFQMLRGA 
VIIFTGLFSVAFLGRRLVLSQWLGILATIAGLVWGLADLIiSKH 
DSQHKLSEVITGDLLIIMAQIIVAIQIWtiEEKFVYKHNVHPLRA 
VGTBGLFGFVI LSLLLVPM Y YI PAGS FSGNPRGTLEDALDAFCQ 
VGQQPL I AVALLGNI SS I AFFNFAGIS VTKELSATTRMVLDSLR 
TWIWALSLALGWEAFHALQI LGFLILLIGTALYNGLHRPLLGR 
LSRGRPLAEES EQERLLGGTRTPINDAS 


5611 


2 


577 


FVLPNRLGIPGSTFRGPGACASSSSLAASAKPGAGGSPA1AMSG 
ELSNRFQGGKAFGIiliKARQERRlABINREFLCDQKYSDEEKLPE 
KLTAFKEKYMEFDLNNEGEIDLMSIiKRMMEKLGVPKTHLEMKKM 
I SE VTGGVS DT I S YRDFVNMM LGKRS AVLKLVMMF EG KANE S S P 
KPVGPPPERDIASLP 


5612 


1 


721 


ASKDGYMDAT I APHR I P PEMPQ YGE BNH I FELMQAMWLC KHLNS 
S LLTIiENLI LNE FS YTATEARRLYLQRKT VPSALLVQL I QERLA 
EEDCI KQGWILDGI PETREQALRIQTLG ITPRHVI VLSAPDTVL 
I ERNLGKRIDPQTGE I YHTT FDWPPES E I QNRLM VPE DI S ELE T 
AQKLLE YHRNI VRVI PS YPKI LKVISADQPCVDVFYQALTYVQS 
NHRTNAPFTPRVLL LGPVGS 


5613 


115 


1279 


RGVDPALRRAEKMLPLSIKDDEYKPPKFNLFGKISGWFRSILSD 
KTSRNLFFFLCXiNLSFAFVELIjYGIWSNCLGLISDSFHMFFDST 
AI LAGIiAAS VI S KWRDNDAFS YG YVRAE VLAGFVNGLFLI FTAF 

FIFSEGVERALAPPDVHHERLLLVSILGFWNLIGIFVFKHGGH
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHSHEVKHGAAHSHDH
AHGHGHFHS HDGPSLKE TTGPS RQ 1LQG VFLHI LADTLGS IGVI
ASAIKMQNFGLM1ADPICSILIAILIWSVIPLLRESVGILMQR
TPPLLENSLPQCYQRVQQLQGVYSLQEQHFWTLCSDVYVGTLKL

I VAPDADARW ILSQTHNI FTQAG VRQLYVQ I DFAAM 


5614 


3 


1268 


LLSRNEHACPLQAGLGLTQRKPKAIRGREGRATNQGQGETQNER
APWGARQRLG VMAELQQLQEFEI PTGREALRGNHSALLRVAD YC 
EDN YVQATD KRKALEETMAFTTQAIAS VAYQ VGNLAGHTLRMLD 
LQGAALRQVEARVS TXGOMVNMHMEBCVARREIGTLATVQRLPPG 
QKVI APENLPPLTP YCRRPLNFGCLDD I GHG IKDLSTQLSRTGT 
LSRKS IKAPATPASATLGRP PRIPEPVHLPWPDGRLSAASSAS 
SLASAGS AEGVGGAP TP KGQAAPPAPPLPS S LDPPPPPAAVE VF 
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDELGIiPPPPPGF 
GPDEPSWVPAS YLEKWTLYPYTSQKDNELS FS EGT V I CVTRR Y
SDGWCEGVSSEGTGFFPGNYVEPSC 


5615 


9 


1558 


AliGRRRPGDPREMEAAATPAAAGAARREELDMDVMRPLINEQNF 
DGTSDEEHEQELLPVQKHYQLDDQEGI S FVQTLMHLLKGNIGTG 
LLGLPLAIKNAGIVLGPISLVFIGIISVHCMHILVRCSHFLCLR 
FKKSTLGYSDTVS FAME VS PWSCLQKQAAWGRS WDFFIjVITQIj 
GFCS VYI VFLAENVKQ VHEG FLES KVFI SNS TNSSNPCERRS VD 
LRI YMLCFLPF1 1 LLVFIRELKNLFVLSFIiANVSMAVSLVT I YQ 
YWRNKPDPHNLPIVAGWKKYPLFFGTAVFAFEGIGWLPLENQ 
MKESKRFPQALNIGMG I VTTL YVTLATLGYMCFHDE I KGS ITLN 
LPQDVWL YQS VKI I* YS FG I FVTYS IQFYVPAE I IIPGITSKFHT 
KWKQI CEFGIRSFLVS I TCAGAI LI PRLD I VI SFVGAVSSSTLA 
LILPPLVEILTFSKEHYNIWMVLKNISIAFTGWGFLLGTYITV 
EE IIYPTP KWAGTPQS PFLNLNSTCLTSGLK 


5616


1 


719 


DDFVROGPQSAAMGASARLLRAVIMGAPGSGKGTVSSRITTHFE 
LKHLSSGDLIiRDNMLRGTEIGVLAKAFIDQGKLIPDDVMTRLAL 
HELKNLTQYSWLLDGFPRTLPQAEALDRAYQIDTVINLNVPFEV 
I KQRLTARWIHPASGRVYNI EFNPPKTVG IDDLTGE P L I QREDD 
KPETVIKRLKAYEDQTKPVLEYYQKKGVLETFSGTETNKIWPYV 
YAFLQTKVPQR3QKASVTP 


5617 


176 


765 


PWRGRGSRPRGAGAMAEEQVNRSAGLAPDCEASATAETTVSSVG 
TCEAAGKSPE PrCD YDS TCVFCR I AGRQD PGTELLHCENEDL 1 CF 
KD I KPAATHHYLWPKKHI GN CRTLR KD Q VE LVENMVT VG KT I L 
ERNNFTDFTNVRMGFHMPP FCSI SHLHLHVLAPVDQLGFLS KLV 
YRVNS YWFI TADHLI EKLRT 


5618 


3 


1692 


YLNYINLKSENKLSGKEDLWEKLQYLWKSTLNLPEDLLRVPDES 
LFLNSGGDSLKS I RLLSEI EKLVGTfl VPGLLE I ILSSS ILEI YN 
H ILQTWPDEDVT FRKS CATKRKLSN INQEBAS GTSLHQKAIMT 
FTCHNEXNAFWLSRGSQILSLNSTRFIjTKLGHCSSACPSDSVS 
QTNIQNLKGLNS PVBI GKS KDPSCVAKVSEEGKPAIGTQKMELH 
VRWRSDTGKCVDASPLWIPTFDKSSTTVYIGSHSHRMKAVDFY 
SGKVKHEQILGDR IESSACVSKCGNF I WGCYNGLVYVLKSNSG 
EKYWMFTTEDAVKSSATMDPTTGHYIGSHDQHAYALDIYRKKC 
VWKSKCGGTVFS S PCLNLI PHHL YFATLGGLLLAVNPATGNVIW 
KHS CGKPLFSS PQCCSQ YICIGCVDGNLLCFTHFGEQVWQFSTS 
GPI FSSPCTS PSEQKI FFGSHDCFI YCCNMKGHLQWKFETTS RV 
YATPFAFHNYNG SNEMLIiAAASTDGKVW I LESQSGQLQSVYELP 
GEVFSSPWLESMLIIGCRDNYVYCLDLLGGNQK 


5619 


2160 


1477 


DSPVLPTSGNVISTAQFAQFWSAVBAALRSLGSPPGAGRGCPCP 
AQSLHSHQLAAWDPLKPSLRSYPPHLLQHPQLRSLTASSGHLGR 
RSCPQPRPLEELLRAGSSTRPQPLTSSCCGMSCMYSFLGHCSVL 
LWGTKGRGSGS PSS PGCCLHPPAQHSQDLPLVHVDVGWQPPLGP 
TVGLRPGLLGERQRGALRAGDPQCQCPLPATVREDLGVPSPWAA 
ECSPPATP


5620


930 


182 


PLPP PTLAMFLTRSE YDRGVWTFSPEGRLFQVEYAI EAI KLGST
AIGIQTSEGVCLAVEKRITSPLMEPSSIEKIVEIDAHIGCAMSG 
L I AOAKTL I DKAR VE TQNHWF TY1SIETMT VE S VTQ AVSNXiALQ FG 
EEDADPGAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFVQCDAR 
AIGSAS EGAQSS LQE VYHJKSMTLKEAI KS S L I ILKQVM E EKLNA 
TNIELATVQPGQNFHMFTKEELEEVliCDI 


5621 


3 


619 


VVEFVE YTATDANVKNESLSS VOOLG IKMTVRYGK r .f .Tmrsa 
ENDLTWVLKHCERFLKQQQTS I KSS LLCLQGNYAGHDW FVSSLF 
MIMLGDKEKTFQFLHQFSRLLTSAFLWLPRLHISSYLPNDTVES 
GIHPVYFCSTHYIEMLLKAELPLVFSAFHMSGFAPSQICLQWIT 
QCFWNYLDWIEICHYIATCVFLGPDYQVYICIAVFKHLQQDILQ 
HTQTQDLQVFLKEE ALHGFRVS D YFE YME I LEQN YRT VLLRDMR 
NIRLQST 


5622 


1122 


456 


AASTKDAVSRKRSHSASEKSGTGTSISKRLNMNPQIRWPMKAMY 
PGTF YFQFKNLWEANDRNETWLC FTVEG I KRRS WS W KTG VFRN 
Q VDS ETHCHAERC FLS W FCDDI LS PNTKYQ VTWYTS WS P CPDCA 
GEVAEFLARHSNVNLTI FTARLY YFQYP CYQEGLRS LSQEGVAV
EIMDYEDFKYCWENFVYNDNEPFKPWKGLKTNFRLLKRRLRESL
Q 


5623


3 


954 


FLP F FIRAPKI SRNGQVJLFTFTTP FPFANKALPGWEGI VPACFW 
RKK I LTPS TGTME LLQ VT I LFL LPS I CSSNSTG VLEAANNSIi W 
TTTKPSITTPNTESLQKNWTPTTGTTPKGTITNELLKMSLMST 
ATFLTSKDEGLKATTTDVRKNDSIISKTVTVTSVTLPNAVSTLQS 
SKPKTETQSSIKTTEIPGSVLQPDASPSKTGTLTSIPVTIPENT 
SQSQVIGTEGGKNASTSATSRSYSSIILPWIALIVITLSVFVL 
VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHESGEH 
SAQGKTKN 


5624


159 


898 


PGVAAAAGALPQYHGPAPALVSCRRELSLSAGSLQLERKRRDFT 
SSGSRXLYFDTHALVCLLEDNGFATQQAEIIVSALVKILEANMD 
I VYKDMVTKMQQE I TFQQ VMS Q I ANVKKDM 1 1 LEJCSEFS ALRAE 
NEKIKLELHQLKQQVMDEVI KVRTDTKLDFNLEKSRVKB LYS UN 
EKKLLELRTEIVALHAQQDRALTQTDRKIETEVAGLECTMLESHK 

ldnikylagsiftcltvalgfyrlwi 


5625 


1 


1180 


tipssaaacragppagalealspggarahaerrgemratplaap 
ags lsrkkrlelddnldterpvqkrarsgpqprlppcllpls pp 
tap dratavatasrlgf yvlle peeggrayqalhcptgte yt cr 
vypvqeaiiavlepyarlpphioivarptevlagtqllyafftrth 
gdmhs lvrsrhr i pepeaavl frqmatalahchqhglvlrdlkl 
crfvfadrerkklvlenledscvltgpddslwdkhacpayvgpe 
i lssras ysg kaad vwslgval ftmlagh ypfqds e p vll fg k i 
rrga y alp agls aparcl vrcllrrepaerltatg i llhp w lrq 
dpmplaptrs klwe aaq wp dglglde areeegdre wl yg 


5626 


3123 


2011 


PPRALGS VAMENQ VLTPHVYWAORHRE L YLRVELS D VQNPAI S I 
TEMVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKLTQRQVN 
IT VQKKVS Q WWBRLTKQEKR PLFLAPD FDRWLDESDAEMELRAK 
E EERLNKLRLES EGS PET LTNLRKG YL FM YNLVQFLGFS W I F VN 
LTVRFC ILGKES F YDTFHT VADMM YFCQMLAWET I NAAIGVTT 
S PVLPS L I QLLGRNFILFI I FGTMEEMQNKAWFFVFYLWSAIE 
I FRYS FYMLTC I DMDWKVLTWLRYTLWIPLYPLGCLAEAVS VIQ 
SIPIFNETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF 
RHLYKQRRRRYGQKXKKIH 


5627 


3123 


2011 


PPRALGS VAMENQ VLTPHV YWAQRHRE L YLRVELS D VQNPAI S I 
TENVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKLTQRQVW 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERLNKLRLESEGS PETLTNLR KG YLFM YNLVQFLGFS WI F VN 
LTVRFC I LGXES FYDTFHTVADMM YFCQMLAVVETINAAIGVTT 
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE 
I FRYSFYMLTCIDMDWXVLTWLR YTLWI PLYPLGCLAEAVSVIQ 
S I PIFNETGRFSFTLPYPVKI KVRFSFFLQI YLIMIFLGLYINF 
RHL YKQRRRR YGQKKKKI H 


5628


75 


1455 


VAGAMAS KCLKAGF S SGS LKS PGGASGG STR VS AM YS S S P CKL P 
S L S P VARS FS ACSVGLGRS S YRATS CLPALCL PAGGFATS YSGG 
GG WFGEG I LTGNEKE TMQS LNDRLAGYLE KVRQLEQ ENAS LES R 
IREWCEQQVPYMCPDYQSYFRTIEELQKKTLCSKAENARLWEI 
DNAKLAADDFRTKYETEVSLRQLVESD1NGLRRILDDLTLCKSD 
LEAQVESLKEELLCLKKNHEEEVNSLRCQLGDRLNVEVDAAPPV 

DTiWPVTtF , PMPr t rtVT?TT.VP , MWDtJTMlirT>Mr TVPrtCPPT M/VMnr^peon 

"Uimv uGiEAL w it{\~\j i a ijj v&wwKKJJABilJVVijLri yo£#JlLiMyy VVSSSE 
QLQSCQAEI I ELRRTVWALEIELQAQHSMRDALESTLAETEARY 
SSQLAQKQCM I TNVEAQLAEIRADLERQNQEYQVLLDVRARLEC * 
EINTYRGLLESEDSKLPCNPCAPDYSPSICSCLPCLPAASCGPSA 
ARTNCSARPICVPCPGGRF 


5629 


2287 


938 


GRPRSS SDNRNFLRERAGLSSAAVQTR IGNSAAS RRS PAARP PV 
PAPPALPRGRPGTEGSTSLSAPAVLWAVAVVVVVVSAVAWAMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
QEVTLQLFTDG I TNKLIGCYVGNTMEDWLVRI YGNKTE LLVDR 
DEEVKSFRVLQAHGCAPQLYCTFNNGLCYEFIQGEALDPKHVCN 
PAI FRLIARQLAKIHAIHAHNGWIPKSNLWLKMGK YFSLI PTGF
ADEDINKKFLSDIPSSQILQEEMTWE^KEILSNLGSPVVLCHNDL 
LCKWIIYKEKQGDVQPIDYSYSGYWYLAYDIGNHF'NEFAGVSDV 
DYSLYPDRELQSQWLRAYLEAYKEFKGFGTEVTEKEVEILFIQV 
NQFALASHPFWGLWAL I QAK YS TI EPD F3U3 YAI VRFNQ YFKMKP 
EVTALKVPE 


5630 


1194 


278 


GFWA I AQTCAHH h 9 PGS PWLVPAS PWRLPEMS S FGYRTL TVALF 
T^ICCPGSDEKVFEVHVKPKKIAVEPKGSLEVNCSTTCNQPEVG 
GLETSLDKI LLDEQAOWKH YLVSNI SHDTVLOCHFTr Qr« Y rurcw 
KSNVSVYQP PRQVILTLQPTLVAVGKS FTIECRVP TVEPLDSLT 
LFLFRGNETLHYETPGKAAPAPQEATATFNSTADREDGHRNFSC 
LAVLDLMSRGGNI FHKHS A P KMLE I YE PVS DSQMVI I VTWSVL 
LS LFVTS VLLCFI FGQHLRQQRMGTYGVRAAWRRLPQAFRP 


5631 


1053 


290 


SRVDDFVRPEPSRAEPSRSGRRRPARRAATMSVFGKLFGAGGGK
AGKGGPTPQEAIQRLRDTEEMLSKKQEPUSKKIEQELTAAKKHG
TKNKRAALQALKRKKRYEKQLAQIDGTLSTIEFQREALENANTN

taiskpvgfgeefdedelmaet.eeiieqeeldknlijeisgpe'rvp 
lpkvps ialpskpakkkeeedddmkelenwagsm 


5632 


3 


952 


VVIK»WSPPRRLWWGSIiGAA01?PAVPVSRtABc;T.Tn/T? , roT3Dutst5» — 
SVRVARGRLGVWAQPQPLLPRPVGSRREMQPPGPPPAYAPTNGD 
FTFVS SADAEDLSGS I ASPDVKLNLGGDFI XESTATTFLRQRGY 
GWLLEVEDDDPEDNKPLLEELDIDLKDIYYK1RCVLMPMPSLGF 
NRQWRDNPDFWGPLAWLFFSM ISLYGQFRWS WI I TI WI FGS 
LTI FLLARVLGGE VAYGQVLGVIGYS JjLPtiXVIAP VLLWGS FE 

WSTLIKLFGVFWAAYSAAS LLVGEEFKTKKPLLI YPI FLLYI Y 
FLSLYTGV 


5633 


771 


460 


QGCS KTM5 VGR P F YRSS E FME QLLSS HLHQ VP F FCCFTWCLCN 
CLFENSVSKLYMLCFNFFMS I FFYSLS ITKLNLI YLWGLSYQSL 
tiLLLLSGHRPWGSSMV 


5634 


1446 


855 


PRATGR I RSRAAAS R PRAGAG ASGAE PRSGRERSRLS GRRAPAM 
ARNTLS SRFRR VDI D EFDENKFVDEQEEAAAAAAEPGPDPS EVD 
GLLRQGDMIiRAFHAALRNS PVNTECNOAVJrERAnR uvt ,mn .tmt?it 

sseieqavqsij:rngvdllwkyiykgfekptenssavlu3whejc 
alavgglgs i irvltarktv 


5635 


3 


943 


drgprstatdtgrar vs fwrfpldpgvkn^nvqi sgekrrfrtii ~ 

rslfhpfpvtrsgapravlvgsswpakmvapavkvargwsglal 

gvrravlqlpgltqvrwsryspefkdplidkeyyrkpveeltee 

ekyvrelkktqlikaapagktssvfedpviskftnmmmiggnkv 

larslmiqtleavkrkqfekyhaasaeeqatiernpytifhoai* 

kncepmiglvpilkggrfyqvpvplpdrrrrflamkwmitecrd 

KKHQRTLMPEKLSHKLLFJ^lINOGPVTKRKMnT.HTnvlA , paMT?nT a 
HYRWW 


5636 


2253 


1143 


LEDTICQHPPAEKKLYI,YHRKLREVER^GIPRLPKDVFMDTHQG 
LTDVRAKVTGFSEG WDS VKGGFS S FSQATHSAAGAWS KPRE I 
ASIilRNKFGSADNIPNLKDSLEEGQVDDAGKALGVISNFQSSPK 
YGS EEDCS SATSGSVGANSTTGGI AVGAS S S KTNTLDMQS SG FD 
AL LHE I Q E I RETQARLE ES FE TLKEHYOJRD YSLI MQTLQ EER YR 
CERLEEQLNDLTELHQNE I LNLKQ ELASME EK I AYQS YE RARDI 
QEALEACQTR I S KMELQ QQQQQ WQLEGLEWATARNLLG KLIWI 
LLAVMAVLLVFVSTVANCWPLMKTRNRTFSTLFLVVFIAFLWK 
HWDALFS YVERFFSS PR 


5637 


946 


2532 


MSFCGARANAKMMAAYNGGTSAAAAGHHHHHHHHLPHLPPPHLH" 
HHIfflPQHHLHPGSAAA\mPVQQHTSSAAAAAAAAAAAAAMIiNPG 
QQ0PYFPS PAPGQAPG PAAAAPAQVQAAAAAT VKAHHHQHSHH P 
QQQLD I EPDRP I G YGAFG VVWS WD PRDGKRVALKKM PNVFQNL 
VSCKRVFRELKMLCFFKHDNVLSALDILQPPHIDYFEEIYVVTE 
LMQSDLHKI I VS PQPLSSDHVKVFLYQILRGLKYLHSAGI LHRD 
I KPGKLLVNSNCVLKI CD FGLARVEELDESRHMTQEWTQ YYRA 
PEILt^GSRHYSNAIDIWSVGCIFAELLGRRILFQAQSPIQQLDL 
2 TDLLGTPSLEAMRTACEGAKAH I LRG PHKQPS L P VLYTLSS QA 
THEAVHLLCRMLVFDP YKR ISAKDAIAH P VT.DPnp r.T? vrnvMntr"" 
CCFSTSTGRVYTSDFEPVTNPKPDDTPBKNLSSVRQVKE I IHQP 
ILEQQKGNRVPLCINPQSAAFXSFISSTVAQPSEMPPSPLVWE 


5638 


125 


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITWWIDPVG 
RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY 
TTETKVHAIPLR£S WVMTCAYAP SGNYVACGGLDNI CS I YNLKTR 
EGNVRVSRELAGHTGYLSCCRFLDDWQIVTSSGDTTCALWDIET 
GQQ7 TTFTGHTGDVMS LS LAPDTRLFVS GACDAS AKLWDVREGM 
CRQT FTGHES D I MAI CFF PNGNAFATGSDDATCRLFDLRADQEL 
KTYSHDNIICGlTSVSFSKSGRLLLAGYDDFNCNVWnALKADRA 
GVLAGHDNRVS CLGVTDDGMAVATGS WDSFLKIWN 


5639 


125 


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACMATLSQITmilDPVG 
R I QMRTRRTLRGHLAKI YAMHWGTDSRLLVSASQDGKLI I WDSY 
TTNKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNICSIYNLKTR 
EGNVRVSRELAGHTG YLS CCRFLDDNQI VTS SGDTTCALWD I ET 
GQQTTT FTGHTGD VMS LS I*APDTRLFV3GACDASAKXiWD VREGM 
CRQTFTGHESDIUAICFFPNGNAFATGSDDATCRLFDLRADQEL 
MTYSHDNI I CG I TS VS FSKS GRLLLAG YDDPNCNVWDAL KADRA 
GVLAGHDNRVS CLG VTDDGMAVATGS VI DS FLKI WN 


5640 



280 


1092 


QQGNKKTMLSHKTMMKQRKQQATAI M KE VHGND VDGMDLGKKVS 
I PRDIMLEELSHLSNRGARL FKMRQRRS DKYTFEN FQYQS RAQI 
MHS I AMQNGKVDGSNLEGG S QQAPLTP PNT F DPRSP PNPDN IAP 
GYSGPLKEX P PEKFNTTAVPKYYQSPWEQAI SNDPELLBALYPK 
LFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELLL 
LTD P R FMS FVNPLS GRRS FNRTPKG W I S EN I P I VITTEPTDDTT 
VPESEDL 


5641 


27 


332 


UKHWUlYLrDVKliljSNQMDKLF^ IQKL I QA 
EIILSDNSSILVLENNFLFKVKSKQFIHLIAKKFYISITIVSAS 
NGESFVLSMIVTG 


5642 


199 


1247 


I TPCRMDFLVL FL F YLAS VLMGLVL I C VCSKTHS LKGLARGGAQ 
IFS CI IPECLQRAMHGLLHYLFHTRNHTFIVLHLVLQGMVYTEY 
TWEVFGYCQELELSLHYLLLPYLLLGVNIjFF FTLTCGTNPG 1 1 t 
KANELLFLHVYE FDEVMFP KNVRCSTCDLRKPARS KKCSVCNWC 
v hk tri J ri n L. v w ViM si (_ J. t*AWN I R x FL 1 x VLTLTASAATVAI VSTTF 
LVHLWMSDLYQETYIDDLGHLHVMDTVFLIQYLFLTFPRIVFM 
LGFVVVLSFLLGGYLLFVLYLAATNQTTNEWYRGDWAWCQRCPL 
VAWPPSAEPQVHRNIHSHGLRSNLQEIFLPAFPCHERKKQE 


5643 


1 


847 


PSGGVRDVETRGPGSRAARG PRWMERRGVGAGAIAKKKLAEAK 
YKERGTVIAEDQIiAQMSKQLDMFKTNLEEFASKHKQEIRKNPEF 
RVQ FQDMCATIGVDPLASGKGFWSEMLGVGDFYYEIiGVQII EVC 
LALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDLIRAIKKLK 
JMJ* L\se l?i if V<jfc»\i. x XjXQS VPA£liNMDHTVVTjQIaAEKtfGYVTrVS 
EIKASLKWETERARQVLEHLLKEGLAWLDLQAPGEAHYWLPALF 


5644 


S3 


1138 


PRRMGSWVQLITSVGVQQNHPGWTVAGQFQEKKRFTEEVIEYFQ - 
KKVSFVHLKILLTSDEAWKRFVRVAELPREEADALYEALKNLTP 
YVA IED KDMQQKEQQ FREW FLKE FPQ I RW K IQE S I ERLR VIANE 

GLGIASATAGIAS S I VENTYTRSAELTASRLTATSTDQLEALRD 
ILHDITPNVLSFALDFDEATKMIANDVHTLRRSKATVGRPLIAW 
RYVP INWE TLRTRGAPTR I VRKVARNLG KATSGVLWLDWNL 
VQDSLDLHKGE KS E S AELLRQWAQELEE WLNELTH IHQSLKAG 


5645 


537 


799 


VQS VR DLKR LS PTD P PGDSGNRD VTRED P VTGP LNS AS SQ VPTL 
YLCLQNSLLGHSSVEDARATMELYQISQRIRARRGLPRLAVSD 


5646 


3745 


3328 


AEQYGTSPHLLPTMLLSSCLPPANVTTKAATPPPLVLSLTTADP 
AGKPAPCRVTLTLLRAS I PATKRAS FLSS FI KMFFEELE YILGF 
LSLLKFHVHVS VYSAI CHFQKEGTGNSRS FTCTPELFPRLQTHL 
RAEGGAQ 


5647 


288 


800 


GVIMATSELSCEVSEENCERREAFWAEWKDLTLSTRPEEGCSLH 
EEDTQRHET YHQQGQCQVLVQRSPWLMMRMG I LGRGLQE YQLP Y 
QRVLPLPIFTPAKMGATKEEREDTPIQLQELLALETAIiGGQCVD" 
RQE VAE ITKQLP P WPVSKPG ALRRSLS RSMSQEAQRG 


5648 


7 


1518 


VLS ELCGRHEALREVGAEWPP PTCSPKI CSGLQQAGNTDWSLTM 
APQSLPS SRMAPLGMLLGLLMAACFTF CLSHQNLKE FALTN PE K 
SSTKETERKETKAEEBIjDAEVLEVFHPTHEWQALQPGQAVPAGS 

hvrlnlqtgereaklqyedkfrunlkg krld intntytsodlks 

ALAKFKEGAEMESSKEDKARQAEVKRLFRPIEEIjKKDFDEIiNW 

ietdkqimvrlinkfnsssssleekiaalfdleyyvhqmdnaqd 

LLSFGGLQWINGLNSTEPLVKEYAAFVIjGAAFSSNPKVQVEAI 
EGG ALQKL L V I LAT EQ PLTAKKKVL FALCS LLRHF P YAQRQFL K 
LGGLQVLRTLVQEKGTEVIAVRWTLLYDLVTEKMFAEEEAELT 
QEMSPEKLQQYRQVHLLPGLWEQGWCEITVHLLALPEHDAREKV 
LQTLGVXiLTTCRDRYRQDPQLGRTLASLQAEYQVLASLELQDGE 
DEGYFQELLGSVNSLLKELR 


5649 


1172 


3006 


KLQEQLDAINEE I RM I QEE KES TE LRAEE I ETR VTSGSME ALNI* 
KQLRKRGSIPTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKLLS PVSREENREDKATI KCETSP PS SPR 
TLRLEKLGHPALSQEEGKSALEDQGSNPSSSNSS^DSLHKGAKR 
KG IKS S IGRLFGKKEKGRL IQLSRDGATGHVLLTDSEFSMQEPM 
VPAKLGTQAE KDRRLKKKHQLLEDARR KGM P FAQWDGPT WS WL 
ELWVGMPAWYVAACRANVKSGAIMSALSDTE 3 ORE IGISNALHR 
LKIiRIiA IQEM VS I/TSP S AP PTSRT S SGNVWVTHEEMETLETS TK 
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSIvGLPQYRSYFMECIiV 
DARMLDHLTKKDLRVHliKMVDSFHRTSLQYGI MCIiKRQNYDRKE 
LEKRREESQHE I KDVTWWTNDQVVHWVQS IGLRDYAGNLHES GV 
HGALLALDENFDHNTLALILQ I PTQNTQARQVMERE FNNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTliGTLQPPPAPPKKIMPEAHSHYTiYGHMLSAFRD 


5650 


1172 


3006 


MLQEQLDAI NEE I RM I QEEKE STELRAEE I ETR VTSGSME ALNL 
KQLRKRGS I PTSLTDLSLASAS PPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR 
TLRLE KLGHPALS OJGEGKS ALEDQGSNPSS SNSS QDSXjHKGAXR 
KGI KSS IGRLFGKKEKGRLIQLSRDGATGHVLLTDS EFSMQEPM 
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWL 
ELWGMPAWYVAACRANVKSGAlMSAtiSDTEIQREIGISNALHR 
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGS WAQTLA YGDMNHE WIGNEWLPS LGLPQYRS YFMECLV 
DARMLDHLTKKDLRVHLKMVDS FHRTS LQYGIMCIiKRLN YDRKE 
LE KRREES QHE I KD VIiVWTNDQ VVHWVQS IGLRD YAGNLH ESGV 
HGALLALDBNFDHNTLALI LQI P TQNTQARQ VMEREFNNLLALG 
TDRKLDDG DD KVFRRAP S WR KRFR P REHHGRGGM LS A S AE TLP A 
GFR VSTLGTLQ PP P APPKKIM P EAHSH YLYGHMLSAFRD 


5651 


646 


1869 


ARQGQRQPWG*EARAKGPASESPRV*EGSGWEGPASP*TPGSTL 
AWGEGAG I R* ASGLTAAG AAS AAAA/ PP PTRGG PAPAG CGRAPP 
WPAPLRVPTHGRAPAPRS RAAPRAPAtiSHGTAAAALSPAS PAGP 
ADP + LPGHSSQS PPRG * RWGRS RS APAPAHPEH PAPAGSAS ASQ 
QTPGWPGS CCLAQGWQAE PLGAPGAEDG \ PVPPQRG FPLGTIjGS 
P AGS WAGLAG YG * AGAP GTQATAPRAAGQT P VAAAPNCR V*GS A 
PALHRAPAAADPGS PLQAPPRAWAS PAAAG PGLS SSDYCGGLGA 
G WRAGT SP R Til .BAAGLfi DWWARf?!^ Dfl Pi R *CXZfi DnfPT T D&ca 

CMPSPPVEGSLGLSRKGHGDLPSQAR*GWHECRRARHLVPLPRL 
LGPRGRTGRPSSPS 


5652 


735 


343 


HHKKYQHIHQKSFSCPEPACGKSFNFKKHLKEHMKLHSDTRDYI 
CE FCARS FRTS SNL VI HRR I HTGE KPLQCE I CGFTCRQKAS LNW 
HQRKHAETVAALRFPCEFCGKRFEKPDSVAAHRSKSHPALLLA 


5653 


66 


1401 


RGRLQSRGRLTLGLVLLLLDILGARQHGQRVSHGWKGGFLTAPL 
CFPQPCQPGTRRGRRRSIiKEATEPQLAMAEEFVTLKDVGMDFTL 
GDWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPNLTSHPDGSED 
LEPLAGGS PEATS PDVTETKWSPLMEDFFEEGFSQE I/SRD VI Q 
GWLLE LQFRRS L YRGHLVR* FARRS RKS SEV * YCHQRGKS HGMQ 
ES* I KERTQS CVHRFHGRRFHG\DNVSKKTLTPAKSKEYRGEPF 
S YSDHSQQDS VQEGEKP YQCSE CGKS FSGS YRLTQHW I THTRE K 
PTVHQECEQGFDRKASHSGYPKTHTGYKFYVCNEYGTPFSQSTY 
LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEHQKTHTDSKSYNCN 
E OGKAFTRI FHLTRHQ K IHTRKR YE CS KCQATFNLRKHLI QHQ K 
THAANV 


5654 


3 


598 


TLPLFPGRRFRGWRRCGAVAARKNSTGGNVSINQRRDSVRMSAIi 
NWKP FVYGGLAS ITAECGTFPI DLTKTRFQ IQGQTNDAKFKEI I 
YRGMLHALVR I GREEGLKALYS G * VGLHAFLCHCSLFHMG IDFR 
PRLHRSQVKSLRCV* KEQIA* */MFSLLISTLISKYIYYAADVL 
EKLFYYIQVQTDNNKKICLFKNI 


5655 


2 


867 


RPPGIRAPRQLHPAAGRRPDASARPRFRPTVLIiHDPFQLSFPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEMI PFKDEGDPQ\REKI FAEI VNPEEEGDLADI KSSLVNES 
E 1 1 PASNGHE VARQAQTSQ E P YHDKAREHP DDGKHPDGGL YNKG 
PSYSSYSGYIMMPNMNNDPYMSNGSLSPPIPRTSNKVPWQPSH 
AVHP LT P LI TY S DEH FS PGSHPSHI PSDVNS KQGMSRHPPAPDI 
PTFYPLSPGGGGQITFPLGWQGQP 


5656 


228 


1066 


PRRVPPLPE FASGPGAAFFHSGRLQRSEiTKDSAGCFSQCRS RAM 
IjVLRS GltTKALASRTIiAPQ VCSS FATGPRQ YDG TF YEFRT Y YLK 
PSNMNAFMENLKKN IHLRTS YSELVGFWS VE FGGRTNKV FH I WK 
YDKFPHRAEVRKAIiANCKEWQEQS 1 1 PNIiARIDXQETE ITYLI P 
WSKLQKPP KEGVYEIjAVFQMKPGG PALWGDAFERAINAHVNLGY 
TKWG VFHTE YGELNRVHVLWWNE SADS RAAVRHKSHEDP I S WG 
GVRE S VNYL \ VSQQNM 


5657 


105 


1052 


GQRLQSPRVQMPVQPPSKDTEEMEAEGDSAABMNGEEEESEEER 
SGSQTESEEESSEMDDEDYERRRSECVSEMLDIjEKQFSELKEXL 
FRERIjS QLRLRLEE VGAERAPE YTEPIiGGLQRS liKI RIQVAG I Y 
KGFCLDVIRKKYECELQGAKQHLESEBCLIiLYDTLQGELQERIQR 
LEEDRQS LDLS S E WWDDKLHARGS SRS WDSLP PS KRKKAPLVS G 
PYIVYMLQEIDIXiEDWTAIKKARAAVSPQKRKSD\DLDPAVHSQ 
GDPQSSWHCTQDSRLPPADRRTHRPLRVCPARLLWCCWALPLHIi 
ALVWTPPL 


5658 


2346 


3541 


TERRVYN P WPEPDP D\ CI QEDP WNJjPNS I KTLVDN I QR Y VEDGK 
NQLLIiALLKCTDTEIiQLRRDAI FCQAIiVAAVCT FS EQLLAALG Y 
RYNNWGEYEESSRDASRKWLEQVAATGVLLHCQSLLSPATVKEE 
RTMLEDIWVTLSELDNVTFSFKQLDENYVANTNVFYHIEGSRQA 
LKVIFYLDSYHFSKLPSRLEGGASLRLHTALFTKVLENVEGLPS 
PGSQAAE DLQQD INAQSLE KVQQYYRKLRAFYI*ERSNLPTDAS T 
TAVKIDQL I RP INALDELCRLM KSFVHP KPGAAGS VGAGL I P I S 
SELCYRLGACQMVMCGTGMQRSTLSVSLEQAAIIiARSHGLLPKC 
I MQATD IMRKQG PRVEIIABtKLR VKDQM PQGAPRLYRLCQPKMN 
GDL 


5659 


2 


696 


WKRSGEVSP KGELGAWRGNSGRPKI IGRAAEAENEDRTLGRIiLP 
GNERSQPRS PLRLLAPQLKAEAAADKGbAPVPPPFS SGHSGPC\ 
EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPAEVA 
AAGGRRMVQKESQATLEERESELSSNPAASAGASLEPPAAPAPG 
EDNPAGAGG \ AAVAGAAGGARRFIjCG WEG FYGRP WVMEQRKEL 
FRRLQKWELNTYL 


5660 


229 


853 


PVTMWAFSELPMPLLINLIVSIXGFVATVTLIPAFRGHFtAARL " 

CGQDLNKTSRQQIPESQSVISGAVFLIILFCFIPFPFLNCFVKE 

QRKAFPHHEFVALIGALLAICCMIFLGFADDVLNLRWRHKLLLP 

TAASLPLLMVYFTNFGNTTIWPKPFRPIIjGIiHLDIjGR*SYHCC 

PYGTYFREPFLVLHILLQVFLFCLCVFPDPFW 


5661 


2 


473 


LNLYPSPCGGIPKLPGLPREAAAALGASFLAEAPLPVTVRGSGL 
AGMAVTCD P KAFLS I CFVTtiVFLQLPLAS I CQN* GTDS CAS RG K 
ADFDVTGPHAPII.AMAGGHVELQCQLFPNISAEDMEIiRWYRCQP 
SLAVHMHERGMDMDGEQKWQYRGRT 


5662 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
P FPKHKPSAKLS VRDAIiGAQNASGERI KIQGWIRSVRSQKEVLF 
LHVNDGSSLESLQVVADSGIjDSRELTPGSSVEVQGQLI ks pskr 
QNVELKAEKI KVIGNCDAKDFPI KYKERHPLEYLRQYPHFRCRT 
NVLGS I LR IRS EATAA IHS FFKDSG FVH I HTPI I TSNDS EGAG E 
LFQLEPSGKLKVPEENFFNVPAFLTVSGQIiHLEVMSGAFTQVFT 
FGPTFRAENSQSRRHIiAE F YMIEAE IS F VDS LQDLMQV I E E LFK 
ATTMriVLSKCPEDVELCHKFIAPGQKDRL*HMLKNNFIiIISYTE 
AVE I LKQAS QNFTFTPE WGADLRTEHEKYLVKKCGNT PVFVINY 
PLTLKPFYMRDNEDGPQEl»EGSVA*HSIiGLMILLSIVVIGQP 


5663 


119 


698 


PADIGRSTAKTPGPFRSLEMDDPRYGMCPLKGASGCPGAERSLL 
VQSYFEKGPIiTFRDVAIEFSLEEWQCLDSAQQGLYRKVMLENYR 
NLVFLGIALTKPDLITC^EQGKBPWNrKRHEMVAKPPVICSIIPP 
QDLWAEQDIKDSFQEAILKKYGKYGHANFQLQKGCKSVDECICVH 
KEEIDNKLNQCLI PKKKK 


5664 


118 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTGYSLVQENGQRKYG 
GPPPGWDAAPPERGCEIFIGKLPRDLFEDELIPLCEKIGKIYEM 
RMMMD FNGNNRGYAFVTFSNKVEAKNAIKQLNNYEI RNGRLLGV 
CAS VDNCRLFVGGI PKTKK 


5665 


347 


702 


WQHL I ILLHCERTSPAMITS ELPVLQDSTNETTAHSDAGSELE 
ETE VKGXRKRGRPGRP P STNKKPRKS PGEKSRI EAGIRGAGRGR 
ANGHPQQNGEGEPVTLFEWKJjGKSAMQRC 


5666 


213 


540 


VSCLPTS CKMITIiNNQDQPVPFNSSHPDEYKIAALVFYSCI FI I " 
GLFVNITALWVFSCTTKKRTTVTIYMMMVALVDLIFIMTLPFRM 
FYYAKDEWPFGEYFCQI LGA 


5667 


1 


695 


HPL PS ASLGLP SVS LG VSLC VRS ALIiEAVVPML P KRRRAR VGSP 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VLDACSSEATHVVMEETSAEEAVSWQERRMAAAPPGCTPPAIjUD 

i swltes lgagq p vpve crhrbevagp s kg pls p awmpayacqr 
ptplthhntglsealeilaeaagfbgsegrlltfcraasvlkal 
pspvttlsqlq 


5668 


691 


894 


csflfcipdlflqfllgrkeeeavlvggewspsldgldpqadpq 
vlvrtaircaqaqtgidlsgctkw 


5669 


407 


1 


DSGAPEGLSPLMSTQEGLSMHAHFQAYTPFIYIiHARKRRGEIGD 
ADSRFNDR YAHKSAQL YFL YFVCW I FQDVYYFTI KEKNHFFFPK 
ARG AP TKYSGS P I GS PTTTP PTRP PS FNIiHPAPHLLASMQIjQ Kit 
NSQ 


5670 


3 


373 


ssecltmawiplllpllilctvsvasyelaqpssvsvspgqtak 
itcsgdvlakkyarwfqqkpgqapvlvi ykdterpsg i perfsg 

STSGTTVTLTISGAQVEDEADYFCYSATDNFLWVF 


5671 


280 


524 


KFPPKKTPPHLGMESAITLWQFLLQLLLDQKHEHLICWTSKDGE 
FKliLKAJOO^LWGIjRKl^TNMNYDKiSRALRLLFMT 


5672 


2 


557 


FVPATPD PGVWLPPSRDPAMAKRSSTjYI R I VEGKNLPAKDITGS 
SDP YCI VKVDNBP 1 1 RTATVWKTLCPFl^GEE YQVHLPPTFHAVA 
FYVMDEDALSRDDVIGKVCLTRDTIASHPKGKFSLPSHTGLPSP 
WPPSHSETS PLGS VWS PAQGKP FLLS PEAGATFCT PGLCSAACS 
QAWLLLPLP 


5673 


327 


696 


I TVADQ I SHWSAGRIKNRTRI PECIHSSAATTLAGPHTMEGES V " 

KLSSQTLIQAGDDEKNQRTITVNPAHMGKAFKVMNELRSKQLLC 

DVMIVAEDVEIEAHRVVLAACSPYFCAMFTGDMS 


5674 


17 


984 


GGGSMEGESTSAVLSGFVLGAIAFQHLNTDSDTEGFLLGEVKGE 
AKNSITDSQMDDVEVVYTIDIQKYTPCYQLFSFYNSSGEVNEQA 
LKKILSNVKKNVVGWYKFRRHSDQIMTFRERLLHKNLQEHFSNQ 
DLVFLLLTPS IITES CSTHRLEHSLYKPQKGbFHRVPLWAITLG 
MSEQIiGYKTVSGSCMSTGFSRAVQTHSSKFFEEDGSLKEVHKIN 
EMYASLQEELKS ICKKVEDSEQAVDKLVKDVNRLKRE IEKRRGA 
Q IQAAREKNIQKDPQENI FLCQALRTFFPKS EFLHS CVMSLKID 
MFLKVAVTTTTISM 


5675 


80 


753 


EGSRRGPTRLARLSARAGRLHFPPGFSSRIiIHFRGVSECRRPPG 
KS GVPVSAPGS DGKWWEERPGMFS LMAS CCGWFKRWRE P VRKVT 
LIiMVGLDNAGKTATAKG I QGE YPEDVAPTVG FS KINI»RQGKFE V* 
TI FDLGGG IRIRGI WKNYYAES YGVI FWDSS DEERNEETKEAM 
SSMLRHPRISGKPILVJ^ANKQDKEGALGEADVIECLSLEKliVWE 
HKCL 


5676 


2 


930 


FVSSPPPRPVQPARPGGPGLSGRRSLLCQVASTPAHVGVMRSPV 
RDLARNDGEES TDRTPLLPGAPRAEAAP VCCS ARYNIiAI LAF FG 
FFIVYALRVNLSVALVDMVDSNTTLEDNRTSKACPEHSAPIKVH 
HNQTGKKYQWDAETQGWILGSFFYG YI ITQr PGGYVAS KIGGKM 
LLGFGILGTAVOTLFTPIAADLGVGPLIVLRALEX5LGEGVTFPA 
MHAMWSSWAPPLERSKLLSISYAGAQLGTVrSLPLSGIICYYMN 
WT YVF YFFGT I GIFW FLLWI WI»VS DTPQKHKR I SH YE KE YI LS S 
L 


5677 


1 


1028 


PPRDGFLELRRLSVPLCSGPCPLTSLSRQGERSGGHnVAAARAA 
VTAE THPL P LLAPLAVCQS VKS PAACQVRPR PRAVAL PAALGGP 
GRSLPGLTAATMSSFSESALEKKLSELSNSQQSVQTIiSLWLIHH 
RKHAGPIVSVWHRELRKAKSNRKXTFLYIANDVTQNSKRKGPEF 
TREFESVLVDAFSE1VAREADEGCKKPLERLLNIWQERSVYGGEF 
IQQLKIiSMEDSKSFPPKATEEKKSLKRTFQQIQEEEDDDYPGSY. 
S PQDPS AGPLLTEELI KAIiQDLENAASGDATVRQKIAS LPQEVQ 
DVSLLEKIT0K3AAERLSKTVDEACLRNRGPGTS 


5678 


3 


593 


SSSPPSSTPSLPLPFYLLLGQLRLQLLWGTAHLSGAGEAAPCPG 
GSGRTAAPRTRADPAAQSXjMIMNKMKNFKRRFSLSVPRTETIEE 
SLAEFrEQFWQLHNRRNENLQLGPLGRDPPQECSTFSPTDSGEE 
PGQLSPGVQFQRRQNQRRFSMEVRASGALPRQVAGCTHKGVHRR 
AAALQPDFDVSKRLSLPMDI 


5679 


2 


623 


LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGE 
DDAEVQQECLHKFSTRDYIMEPS IFNTLKRYFQAGGS PENVIQL 
LSENYTAVAQTVNLLAE WL I QTGVEPVQVQETVENHLKS1.LI KH 
FDPRKADS I FTE EGETPAWLEQM I AHTTWRDLF Y KLAEAH PDCL 
MLNFTVKVGRVLELRRKVFMNVYFWLLVCFL 


5680 


258 


592 


RRLTSTSEKLQNRNSHTPLESLIHPQPSYKGFGIMF^KKKIE 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQ^SLLADTANRPKPMV 
DPSC I TPIQLAPMKTIVRGNKPC 


5681 


45 


869 


LLCAKTLGVRTKES0AEGYNRSG1NNHQAEDPRFCPSFCWMRSA 
RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWMFLPLVFTL 
FTSAGLWIVYFIAVEDDKILPLNSAERKPGVKHAPYISIAGDDP 
PASCVFSQVMNMAAFI^LWAVI^FIQIjKPKVLNPWLNISGL^ 
IiCLAS FGMTLLGNFQI/TNDEE IHNVGTS LT FGFGTLTCW I QAAX> 
TLKVNT KNEGRRVGI PRV I LSAS I TLCVGPLLHPHGPKHPHVCS 
QGPVGPGHVL 


5682 


39 


622 


PSRSCLGTMRKWRHREVNLPEVTQQDAVCPAPIPSPGLSAQTGIi"" 
QKIWGTIHCQVCPGAPAWPGSPWHEEMGIiLIiLVPIiLIiIiPGSYGL 
PFYWGFYYSNSANDQNLGNGHGKDLLNGVKIjVVErPEETLFTYQ 
GAS VI LP CRYRYEPALVS PRRVRVKWWKLSENGAPEKDVLVAIG 
LRHRS FGDYQGRVHLRQD 


5683 


89 


778 


GS GGATALITRCLAWS VL I S RLAMAT YTC ITCRVAFRDADMQRA 
HYKTDWHRYNLRRKVASMAPVTAEGFQERVRAQRAVAEEESKGS 
ATYCTVCSKKFASFNAYENHLKSRRHVELEKKAVQAVNRKVEI4M 
NEKNLEKGLGVDS VDKDAMNAAI QQA I KAQP S MS P KKAP PAPAK 

EARNWAVGTGGRGTHDRDPSEKPPRLQWFEQQAKKLAKHSEDD 
SEDEEHDLC 


5684 


195 


677 


x yi\-f fa* i jjiyfK v x in a^ujua x Lii. vbi L> v SAKYRGAFCEAKI KT 
AKRL VKVKVTFRHDS STVE VQDDHI KG P LECVGAI VE VKNLDGAY 
Q3AVI NKLTDAS WYTWFDDGDEKTLRRS3LCLKGERHFAES ET 
LDQLPLTNPEHFGTP VIG KKTNRGRRYE 


5685 


779 


1262 


LXiLCWPVVKCFiLFPPFRFSHHMIPGPPGPHTTGIPHPAIvTPQ - 
VKQEHPHTDSDLMHVK PQHEQRKEQEPKR PHIKKPLNAFML YMK 
EMRANWAECTLKESAAINQILGRRWHALSREEQAKYYELARKE 
RQLHMQLYPG WSARDNYVS P S S I PVAIiHS 


5686 


128 


1181 


CTWWQVNITLI^INDNHPTWKDAPYYINLVEOTPPDSDVTTVVA 
VDPDLGENGTIjVYSrQPPNKFYSLNSTTGKlRTTHAMLDRENPD 
PHE AELMRK I WS VTDCGRP PLKATSS ATVFVNLLDLNDND PTF 
QNLPFVAE VJUEGI PAGVS I YQWAI DLDEGLNGLVS YRMP VGMP 
RMDFLINSSSGVWTTTELDRBRIAEYQLRWASDAGTPTKSST 
STLTIHVLDVNDETPTFFPAVYNVSVSEDVPR\GSGWSG*AARN 
NDVGLMAELSYFITGGWVDGKFSVGYRDAWRTWGLDRETTAA 
YML I tiEAI D WGP VG KRHTGTAT VFVTVLD VNDKRPI ILQSSYV 


5687 


17 


917 


AAPPAPPDG/PPP/PPPAPPT/PGPAA/APASSCQPRLSAGRAA' 
QGDGGAAAVGHVLWP AVGPVR VNPGLQTP VPR PELLPG P \ S S S 
LHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 
SGCRMPSTSASE /AAGGQGACTHAKGS ETPPPAS P0TSEPAPSP 
LPPH3jTGGPGMYSSEAKLPNSFSCLGLAGTGAGI*GTASAHGTG 
PPVLPHVCTPSLANPQP\AVGPEASSLPLGVSGIGMSA/SAPrS 
SSPFVAIGSCWLRGIPPPGSGFLCPGRAPGPVPITTHGQEGQGP 
VLDI 


5688 


1 


420 


LTKWDLFGNC YRLLKTG IEHGAMPEQVGVYWYS / CL YDSRKLF~ 

*SHMIIRSLL*KVIDDSLGQLPLLRELtiIi* *LNVIDRCI ILAYV 

I^VEKTFAJTYLKNFTVKVDFSLLGEIPLISMAAILiKLWIMKID 
DGYIPAVF 


5689 


1504 


3 


HELSG KHI SMVSGNTCNWHPGGHS PGGGGQGE I TSKDRGE I PAL 
I WA/RK?IGTWTATKPTHRAG*GGAEE YQPPPQP CEGPRS TSRG 
GEG*GHAVGPGREIGKEGSLPFLGPKALGF*SASCQRAFEGGAH 
GSTARKPAPATPGTRHPRTMETREVAQGWPAGPRSQFWDQHPHS 
PGEHRPSG\SPLPACPPRAWPKAGAVASATGTG\PQLPGSRGKQ 
KL PRTRE PPLLQAG WAVR KP P WS EAKEGLGQAGR PSGMDS S AS \ 
PQTPGGRGSLEVJGLPLYLGPHHDVK*RSDRLG+ PP * GGQGGGGH 
GAPSTPGPGGEAW*LPQQTSRPKPGPQAY*GE\GSPGLQCPCSK 
EL*RVPPGSLGPSTQCKYEPTDKHS\GGADAQLEVSTAGSRSTF 
GQELKGPLDAGRLWPGAPSASSSHR*GG*ERARAGAGHRGST*A 
S SKIEQGRPRPGPTSDALADVEGGAES /GPHPWPLPGTLPNR/ P 
GS PP PA* AS AGRKGTVSTLGGGLL 


5690 


1424 


58 


P5 PPAGVCAAPAP IiPLIALARRDRRPCS PGAEAAPWQTGGPAI D 
GAWRTS VSALRRGATG/ APCS PGAEAAP WQTGG PAI DG\ DGEL P 
* VRSEE APRGCGAEGGGPGSGP VRR PGAGRGAHAGQGRQQDPE P 
DGLRHRQHGAASJIARHRLQRLRPGHHQNRHVRRDPQAPPGGPAP 
GHAAALPERTRG VAE PPAWAHAGSDAWRAGR* SQRT * ERAR PRH 
PTFQGRAGS \GQPGYQPPNPH PGPSS PPAAP\GPRGA* GNPQLE 
KAPRSDRWPSQGIjRTRIRRPETPDCGPPSPAGSSASASTFRCTS 
SLSLLGP/PGAHNLDTAPQDR*HGP*GDKRGAPGVAGEDPRPP+ 
GNFVR * LLLMP/GVA* RHGTS PFLGPSLGENGGQWDSGNLFGTP 
KG* SHP AFTKST * SMEAE KS YWNHPHR \ DRGRQG VR I NCLR VGE 
SEMWGPYSAPRPGTVFLSSFLSPASEEH\PEGSSSFNTPFPPAG 
PEGDPGLNSPGLLP 


5691 


107 


550 


ISNDPSPGYNIEQMAKRGKKLVELPYTVKGMDVSFSGZLSFlED 
VAHRMIiATGECTPEDLCFSLQVMQ*KTGTESWG*RFYIVEON*S 
GDAPLIFSPYI*SLTGNCGFAMLVEITERAMAH\OGSPGGPSLWG 
GVGVYVLLESVPXjSYS 


5692 


1193 


548 


TQAWTRAEKDRKGSVRALRLHLERGPPT*RGSHPL\QSVPCIQK " 
PSIFSSYPI/GLPQSGGEPGPVGEQQPVRRPEQPSCGPASRMPI, 
TSRSVPPGRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQ 
RLNLPVMGATRSNLQPPRKVAVPGPTR* RDQDSKQDFSS KPLQS 
VPGLASTOOTLTPADS GPGTGGRDATR ACJI.P fl VPTtvrnwmm 


5693 


1258 


1330 


ALTWPWKGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP " 
*QAGPP£SLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNS?RPSGTRHP/PGPSSRVIiYSPSIiPRNS 
PEAIVWRSSRFPLWFPLRCCFWVSGFKDPNPVIiRFF 


5694 


3 


1338 



GS KE P ARSLHRRGSGHKS S AGKWGS VtfliS TAGALG * KQLHQ * WT * 

QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 

SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 

CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 

KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 

IARPS TSGS FG YKKP PPATGTATVMQTGGSATLS KIQKSSGIPV 
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRP VSSS I DPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKEKAJCAKAVALDSDNISXjKSIGSPESTPKNQASH 
PTATKLAELPPTPLRATAKSFVKPPSLANL0KVNSNSI.DLPSSS 
DTTQCI 


5695 


3 


1338 


GS KEPARSLHRRGSGHKSSAGKWGS VTLSTAGALG* KQLHQ* WT 
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGL5 W FS ESEE KAPKKL E YDSGSLKMEPGTS KWRRER PE S 
CDDS SKGGELKKP I S LGHPG S LKKGKTP PVAVTS P I THTAQSAL 
KVAGKP EG KATDKG KLAVKNTGLQR S SS DAGRDRLSDAKKP P S G 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 
KPVNGRKTSLDVSKSAEPGFLAPGARSNIQYRSIiPRPAKSSSMS 
VTGGRGGPRP VSSS I DPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDRE KBKAKAKAVALDS DN 1 5L KS IG S PEST PKNQAS H 
PTATKLAELPPTPLRATAKSFVKPPS1ANLDKVNSNSLDLPSSS 
DTTQCI 


5696 


3 


1338 


GS KE PARSLHRRGSGHKS SAGKWGS VTLSTAGALG * KQLHQ * WT 
QRCL\NNLSSE3FNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSS KGGELKKPISLGHPGSLKKGKTP PVAVTS PITHTAQSAL 
KVAGKPBGKATDKGKLAVKNTGLQRSS SDAGRDRLSDAKKPPSG 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 
KP VNGRKTS LDVSNSAE PG FLAPGARSNIQYRSLPRPAKS SS MS 
VTGGRGGPR PVS SS IDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDRE KEKAKAKAVALDSDNI S LKS I GS PE STP KNQASH 
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI 


5697 


1147 


47 


PSEAL S PPAC P S APAPRRS 1 1 SRL FGTS P ATEAAP PPPEPVPAA 
QGPATVQS VED F VPDDRLDRS FLEDTT PARDE KKVGAKAAQQDS 
DSDGE ALGGNPM VAGFQDDVDLE DQ P RGS PPLPAGPVPS QDI TL 
SSEEEAEVAAP7KGPAPAPQQCSEPETKWSSIPASKPRRGTAPT 
RTAAPPWPGGVSVRTGPEKRSSTRPPAEMEPGKGEQASSSESDP 
EGPIAAQMLSFVMDDPDFESEGSDTQRRADDFPVRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKESSEEGK 
EGKTPS KENKKKKKKGKEEE EKAAKKKS KHKKS KD KE EGKESRR 
RRQQRPPRSRERTAA 


5698 


2 


666 


GARAAEPQEDLPPLSQSSRFFQEQQKMNKSLGPVSFKDVAVDFT 
QEEKQQLDPEQKITYRDVMLENYSNLVSVGYHIIKPDVISKLEQ 
GEEPWIVEGEPLLQSYPDEVWQTDDLIERIQEEENKPS.RQTVFI 
ETLI*R/ERGNVPGNTFDVETNPVPSRKIAYTHSLCNSCER\GF 
NASSBYISSDGRYARMKADECSGCGKSLLHIKLEKTHPGDQAVE 
FNQ 


5699 


2 


1448 


rvrqppglwvrrtvpamqcpaglsrvpgvag/dpslpsfrgprd 
eaahrgtiqtarhtrklyvqgpasgpplprvstqva i *dekpla 
rps/grtnapfpqgqkpag kaapgpaaagrvamr\ pghpgllas 
dsqrssskgsgwetpvpws*aqpgwvsgllllgdpsgpgsl*rs 

TWLVGGARGPEGSGVRGSGWPSGCSDIGWALAGW2SHS*HLDPNT 

wtqkwtge / spapgeeg\vapaprgptaehghcelttesqysnn 
vpilfqnpsgalrsrrtepagwvpptrhe+ddg*taapasggap 
ifiWAO I f/ lnaslgptdpqgkpgcrppcalpkpagpersa* 
ggslgcr/smlpassgpppapgprrlaagahtsasarcppaaaa 
gwq prr pg fagraal pg pphp pss * relgglpgpgw * tldplpa 
hpahppgsappwgalggwaaaraslpwspslclsfpavtpvagl 

FPPGRG 


5700 


923 


597 


NGHKGVWEINIY*RRSNIHKNSKSESHLNQDHSFPPPTPNSARS 

klhstgtakntglplsgaprqravfsgrticqefssclqcayld 
e*csiasslikailrvsvlse 


5701 


59 


410 


i feki csdtqe fl s pe inpq i cswli fdkgak/nhatgkdslfn 
kwswknwlstcr*mrpgpyftpytkinsk*ik/danircetvkl 
leentgenlhdtglgnvfldmtpktqptkqk 


5702 


3 


1517 


ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR 
ASESSASSDGPHPVITPSRASESSASSDGLHPVITPSRASESSA 
SSDGPHPV2TPSRASESSASSDGPHPVITPSRASESSASSDGLH 
P V I TP SRAS ES S ASSDGPHP VITPSWS PG SDVTLLAE ALVTVTN 
IEVICJCSITEIETTTSSIPGASDTDLIPTEGVKASSTSDPPALP 
DS TEA K PH I TE VTAS AETLSTAGTTE SAAPHATVGTPL PTNSAT 
EREVTAPGATTLSGALVTVSRNPLEETSAL3VETPSYVKVSGAA 
PVS I EAGS AVG KTTSFAG5SASS YSP SEAAljKNFTP S ETLTMD I 
TTKGPFPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 
PTATPTTARTRPTT\A*VQVKMEVSSSCG*VWLPRKTSIiTPEWQ 
KG * CS SS TGNS TPTRLTS RS P YC VSGEANG / P S AAARHVP YAKR 
GCCP * PGPPPTDCS C VT VIjRGTQ KVPMKGSMS KPLTPDVATGPS 
LTSTG VYVWGGAS P VPRG VLGLTLAHVLC FS KEKT 


5703 


14 


1117 


HHKDSRSQGLPRTQECARPELRPLLCPRALWPVTRLSYRCPWQA 
PKAGIGTKAKPSESHLKLHPGWPSLDRQGEPATLGTGTGHCSDS 
RILRWHP*HTAAR* PRWRRLPSSHRWTRHLGVLRVQDKS * * VSL 
DPSCRPRFLRTC*+YGMRSVASSSNPPPGWSGPGASVFPARPVS 
ALPXGPRCW*APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR 
GSWETAPGS*WCPWL*AARWTGWRTASGASAGLGRAADRPSAWA 
RRVAGLLPGOGLTVRR*H*TAGAPASVRSSQGATRSPAPGGDQC 
AOGRGPGSC*HPPPWPVSPSSPVPCPSGR*HLRGPLLSAARPRA 
AGWPRHSPHDTQTPEP 


5704 


23 


562 


GDYEPDSP YWDDISQAAKDLVTRLME VEQDCRITAEEA1 SHEW I 
SGNAASDKNI KDGVCAQ IEKNFARAKWKKAVRVTTLMKRLRAPR 
QSS TAAAQS ASATDTATPGAAGGATAAAASG ATS APEGDAARAA 
KSDNVAPRRP * LPPQPQMEVPPQPLMAVSPQPPMEASLQPIiMGE 
SPQP 


5705 


23 


562 


GDYE FDSP YWDDI SQAAKDLVTRIiM E VEQDQRI TAEEAI SHEWI " 
SGNAASDKN I KDGVCAQ I E KNFARAKW KKAVR VTTLMKR LRAPE 
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA 
KSDNVAPRRP * LP PQ PQME VPPQ PLMAVSPQP PMEASLQ PLMGE 
SPQP 


5706 


1161 


610 


QLGRFXAQDTVAIRKVKEVFGTGAMRHWILFTHKED*GGQALD 
D YVANTDNC S LKDLVRE CERRYCAFNWWGS VE EQRQQQAELLAV 
IERLGREREGSPHSNDLFLDAQLLQRTGAGACQEDYRQYQAKVE 
WQVEKHKQEtiRENESNWAYKALLRVKHLMLLHYEIFVFLLZiCSI 
LFFIIFLF 


5707 


28 


609 


GSPAPTPGFRRRPGRGTPSPGTRHHQGRABPEPDAPERAPLRR* 
MFA1 Q PGLAEGGQ FLGDPPPGLCQPELQPDSNSNFWAS AKDANE 
NWHGM PGRVE PILRRSSSES PSDNQAFQAPGS PEEGVRSPPEGA 
EIPGAEPEKMGGAGTVCSPLEDNGYASSSLSIDSRSSSPBPACG 
TPRGPGPPDPLLPSVAQA 


5708 


44 


1925 


S FS WEETISPCFPKMPAE PWWLS PVS LGAAG WP GQPRP YLDLPA 
Q AS VSRPHDRA* GEAVS LSLS SGD VCGHTDGGGAGSDPQAXPKP 
PRCPFTAMPSPRTKQKVRNKVCLLIAIRYSDIPSDVSKAP\GPA 
GNPHDRSSTAA*LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL 
RVSPASGGPRKEGRQGSGG* AGGGGP \ARTHADLPCVGFVCS PP 
LLK*SDSPVKQLPA\SGQGSGAGMPPVGSSDILRPRPTSVSGTG 
RAAG * CS WQPAACCTPRSQ* WAVARSPSRCSRW* RQSGR+RG* S 
S RRRRGP * AAGRS TPAVP * P CS * GGAGRRAYACRTaWR Y» p qp * 
LEPSGPTSGSAL* TWASHSTGA* *SRLCGTAGTGPLCSQSSRS * 
AG*RCCCTAASPCGGSGPSHPGSPSAHCLSWSGGRTQPRAPSAH 
GRGRAMGSRCVCTCTGLPC PG I PLSGAS P GGSGETGAGRSHTLK 
AARSRLSPRPGSGSRGSY+SHNDNWGTWPAPPSAGHLLVGG*NS 
QRTSSDH*YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 
PPRPPRLPAAAS / SGGASGS PAASCSCSCRAPAKPASS/GEAPA 
PPPRPEPPPPPARRP 


5709 


2 


2031 


ITLCPLPQTBKCLNVVTEAATPLGI YIiKARVEAGGLKELEI S WG 
LHQIVVRWGAVVMRAGMGGCRCWGVP1APFAPR/NALS FLVNDCS 
L I HNNVCMAAVFVDRAGE WKLGGLD YM ^SAQGNGGGPPRKGX PE 
LEQYDPPEIiADSSGRVVREKRSADMWRLGCLIWEVFNGPLPRAA 
ALRNPG KI P KTLVPH YCEL VGANP K VRPN P ARFLQNCRA PGGFM 
SNR FVETNLFLEE IQI KEP AEKQKFFQELS KSLDAFPEDFCRHK 
VLPQLLTAFEFGNAGAWLTPLFKVGKFLSAEEYQQKI I PVWK 
MPS STDRAMR I RLLQQMEQ FI Q YLDEPTVNTQ I FPHWHG FliDT 
NPAIREQT VKS M DIiLAPKLNEAKLNVELMKHFARLQAKDEQGP I 
RCNTTVCIiGKIGSYIiSASTRHRVLTSAFSRATRDPFAPSRVAGV 
LGFAATHNIiYSMNDCAQKILPVLCGLTVDPEKSVRDQAFKAIRS. 
FLSKLE3VS EDPTQLEE VEKDVHAASS PGMGGAAASWAGWAVTG 
VSSI/TSKLIRSHPTTAPTETNIPQRPTPEGVPAPAPTPVPATPT 
TS GHWETQE EDKDTAEDS S TADRWDDEDWG S LEQEAES VLAQQD 
DWSTGGQVSRASQVS\TPTTNPPNPQSPTGAAGK\RGLI»GTGLA 
GAKLPGATS * RYTAGQRV 


5710 


1 


562 


I PGS T I S CE VELMARMAKT IDSFTQNOTRLWI IDGIiEACEQDK 
VLQMIiDTVRVLFSKGPFlAI FASDPHI I IKAI WQNLMSVPSGFK 
\IjNGHD YMRNI VHIiPVFLNSRGIi/RQ/ LQENFS * LQQQMETFHA 
QILQGYRKKLTEEFHRTALGR*QNLVARQP£ I DG* DAIG FELYV 
CIAI QFNTNKDDAT 


5711 


1526 


1130 


RRHPFQWTTVTQEAFSHHDVAFTSTPVLFYPDSAQPFIVKSES3 
SQIAKAVLSQQRPSLFHECAFHFFS * SLQRHTINLDQGI F*L I*M 
tiSEERQHLFES S / 1 WTTPHNLK* /FE IHEHLGSHEGHWTLFFU, 
QIL 


5712 


3 


1331 


GRKLFQSLDISERLECFLLTLDCVDDTLXVIiAEEHGCLDUKELP 
ETVI Dl »T .NKCLTFHPSKRPTPDELMKDKVFSE VSPLYTPPTKPA 
SLFSSSLRCADLTLPEDI SQLCKDINNDYLAERS IEE VY YLWCL 
AGGDLEKELVNKEI IRSKPPICTLPNFLFEDGES FGQGRDRSS / 
TFR* YHWD IWMPAKK* IERCWGRSILP ITLKMTSI* I LPYSNSN 
NELSAAATLPLI IREKDTE YQLNRI I LFDRLLKAYP YKKNQI WK 
EARVDI P PLMRGLTWAALLG VEGAI HAK YDAIDKDT P I PTDRQI 
EVDIPRCHQYDELLSSPEGHAKFRRVLKAWWSHPDLVYWQGLD 
S LCAPF L YIiNFNNEALVYACMSAFI P K YLYNFFLKDNSHVI QEY 
LTVFSQMIAFHDPELSNHLNEIGFIPDLYAIPWFLTMFTHVFPL 
HKI FHLW \ DTLLLGEFLFP I I*YWE 


5713 


634 


284 


PVCAVPVDRWPVLPREDQEGQQL*AKLPRDFRR*FQIU3PMEGH 
TACRCS RRG AQVQHIi PREDI RAAE * DPHLRE VWPGL PTSS ATS P 
*RAVLTSPCSHLGSADAASSHWLCGVSFH 


5714 


212 


613 


WGLGLG PTMSS LGGGS QDAGGS SSS S 7NG5GGSG5 SGPKAGAAD 
KS AVVAAAAPAS VADDTP PPERRNKS G I ISEPLNKSLRRSRPLS 
HYSSFGS SGGSGGGSMMGGESADKATAAAAAASLLANGHDLAAA 
MA 


5715 


131 


1979 


ESASQQKRSKCLILTLKLELSGSAPKKTSARPGSSLWLPPHSQE 
QTPPASKLQGGGGGLQTGWGLHPVPVTAASPLPRWCIiFGAVAK\ 
GLPGP * LCPSGAA/ GGLQRGPGLS PLGAAGKVSCLHPPSMVENN 
DSTCHEHHEGILAARVTPVP\SGKPGRVIjKPPGRVCRPPHPAAS 
PRP PGS /S DLDGPRP QMHLRAFP AAHGG PVNTPHGGE EKT FMS S 

QIRRKETKPL* rktpag\nnyqsktsi pvsqspqltvdllpsagr 

TQAPSGRGDAGKPTPGHG \LPKAS VI LTPNCPCSLAGGQ* PPGL 
YPKTPKQRRWRRPL/LLGPSQ*GSRQSTC*EV\GALGEPVRIPG 
L*PDLSCIliSfJGSKHRREGI*SFPRSLGPGRRGPAGIiQSLGCSPT 
e ItNTACHS SGHVAIjQAGHDS ARDVGSGHVALQAGHDS tqdvgrp 

VWRWIPIiE * lglsretgqatrrglvwis pgraaaacvacaqale 
eg ptirlpgqdrgaqpcs hcp graagqp bpgagapcre /gg* dpt 

GLT/GVPGTDPKRGGRKPGQSGQBTQGPTVWSGPESPLQPKP*E 
RQE / VGAGAS SGVGLSRGRAGGPS S AWEVAAMLLLLRHGSHSEL 
TDLTEAQTSQH 


5716 


1711 


1370 


RVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNliFSYSF*GVARYAC* 
RCFLVL*SGFFTIIVGGYSCCMPLXT 


5717 


44 


1489 


LPTEALRESEWVSEYGKCGPRGLVPEGESTSPLP^SVDTED^LD 
EGPGAI*VLESDLLLGQDLEFEEEEEEEEGDGNSDQLMGFEROSE 








GDSLGARPGIiPYGLSDDESGGGRALSAESEVEEPARGPGSARGE 
RPGPACQLCGGPTGEGPC03AGGPGGGPLLPPRLLYSCRLCTFV 
SHYSSHLKRHMQTHSGEKPFRCX3RCPYASAQriVNLTRHTRTHTG 
EKPYRCPHCPFACSSLGNLRRHQRTHAGPPTPPCPTCGFRCCTP 
RPARPPSPTEQEGAVPRRPEDALLLPDLSLHVPPGGASFXiPDCG 
Q\ CGVKGRASAGLDQNHCQS / SLFPWTCRGCGQELE EGEGSRLG 
AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTGEKPYKCPI* 
CP YACGNLANLKRKGR I H SGDKPFRCSL CN YS CNQSMNLIRHM 


5718 


120 


284 


VAHALSLPAESYGNDVSMTHPQLPPTQLAWDLCRTCLPLSYNFT 
S**STADPLHL 


5719 


43 


428 


ELNNGPFQMPIiCNGGNIiAVTGSWADRSPLHRAASQGRLLALRTli" -- 
h S QG YNWAVTLDHVT PLHE ACLGDHVACWRTLL E AGANVNAI T 
IDGVTPLFNACSQGSPSCAELLLEYGAKAQP\ESCLPSP 


5720 


1 


1051 


LQAFRNASE VPMVLVGTQDAISAA ^NPRVYRRTSRARKIjSTDLK 
\RCT\YYE\TCGGTYGLQMWSVSFQDVAQKWAL\RKKQQ\LAI 
GPCK\SLPK\SPSH\SAVSAASIPARAPINQGHE/SGGGSAFSD 
Y\SSSVPSTPSISQRE1>RIETIAASSTPTPIRKQSKRRSNIFTS 
RKGADP \DRE KKAAGC KVDS I GS GRA I P I KQG I LIiKRSG KSL>NK 
EWKKKYVTLCDNGLLTYHPSriHDYMQWIHGKEIDLLRTTVKVPG 
KRLPRATPATAPGTSPRAWGLSVERSKTQLGGGTGAPHSASSAS 

LHSERPLSSS AI«VGPRPEGLHQRS CSVS SADQWSEATTSLPPGM 
QHPASG 


5721 


97 


492 


RHS S P C CSiiRRTERSSNAAVST / T T VQQFKR F I ENYRRH 1 GC VA 
VF YAI AGGLFLERAY Y YAFAAHHTG ITDTTRVG I ILSRGTAASI 
SFM FS Y ILLTM CRNLI TFLRETFLNRYVPFDAAVDFHR L I AS TA 


5722 


83 


1043 


VALDVI^SSPGGGC^GALIjGPRVHGIRAVLRV'ARGGVQAPGAP 
GSLG VSHAAAP PARPQGAAQS PHRGRRHGGGGAGLP P PRS PR FP 
QESVPASTSTARGPRRVSRRIjPPQHPGPRGRRRRPGAGVGAPRR 

grargqagllgrqgqggrga3reraalqarrgrrpgpepdqs cg 
grprraaaapgrapadpqppaprpapapdvrppadapapapapa 
pp ppphlgaltagsgeerqs qpraetlrlgrgaplp\ praergg 
rpkqaeqqqSpkrptppargpqssgdpamlpqraglrtgglagt 
ksstreipemi 


5723 


88 


1043 


valdvlagss pgggmagallg prvhd i ravlrvargg vq apgap 
gslgvshaaapparpqgaaqsphrgrrhggggaglppprsprfp 
qesvpaststargprrvsrrlppqhpgprgrrrrpgagvgaprr 
grargqagulgrqgqggrgaereraalqarrgrrpgpepdqscg 

GRPRRAAAAFGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 

pppp phligaltagsge erqsqpraeti/rlgrgaplp \ praergg 
rpkqaeqqq\pkrptppargpqssgdpamlpqraglrtgglagt 
ksstreipemi 


5724 


3 


1841 


ftneappaplpdasasplsphrraksldrrstepsvtpdllnfk 
kgwittkqyedgqwkkhwfaladqslryyrdsvaeeaadldge id 
lsacydvteypvqrnygfqihtkegeftlsamtsgirrnwiqti 
mkhvhpttapdvtsslpeeknks5csfetcprptekqeaelgep 
dpbqkrsrare\rrregrsktfdwaefrpiqqalaqervggvgp 
adth\dpwrpeaehgelererarrreerrkrfgmldatdgpgte 
darlrmevdrs pglpmsdlkthnvhvei eqrwhqvettplreek 

QVP 1 AP VHLSS EDGGDRLSTHEL TSLLE KELEQSQKEASDLLEQ 
NRL L^DQLRVALGREQSAREG YVLQATCE RGFAAMEETHQ KKI E 
DLQRQHQRELEKLREEKDRLLAEETAATI SAIEAMKNAHREEME 
RELEKSQRSQISSVNSDVEALRRQYLEEIiQSVQRELEVLSEQYS 
QKCLENAHLAQALEAE RQALRQ CQRENQE LNAHNQELNNRLAAE 
ITRLiRTLLTGDGGGEATGSPLAQGKDAYEIiEVPSGARPCLTQIfC 
TQEPQGSAAWPLSYRWGGTDLRQQESQGPGRSKSPEGGEEQ 


5725


3 


1049


VNGHSEE TSQS PNRTEPHDSDCS VDLGI SKST3DLS PQKSGPVG 
S WKSHS I TKfME IGGLKI YD I LSDN\ DLS SHLQPLK/ FTSAVDG 
KWIVRSKAATLLYDQPLQVFTGSSSS SDLISGTKAI FKFDSMHN 
P E / GAKYNKRPHKWAHNLKLKYMVLHS 1 1 SNTVAV \ RSQRHFVA 








IjQTKSPNRPCQPSSSAPS/VDQRAQ/INQSYAKHSANMNPSNHN 
NVRANTAYHLHQRLGPARHGEMWAI S PNDRLI PAVTRS TI QRQS 
SVSSTASVNLGDPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 
SQRPLSART YS IDGPNASRPQSARPS INB I PERTMS VSDFNYSR 
TSP 


5726 


2 


486 


SRSLSMWWNSGLPASSHSSKLPVTVGFSGCVKRLRLHGRPLGAP 
TRMAGVTPCI LGPLEAGLF FPGSGGVI TL/ ES VGAGI PGPSRAG 
\ QGSPGGSGEGPPLSSPSQPLPADLPGATLPDVGLELEVRPLAVT 
GLI FHLG QARTP P YLQLQ VTEKQ VIiLRADDG 


5727 


21 


221 


RPILILKETRRLPWATGYAEVINAGKSTHNEDQASCEVLTVKKK 
AGAVTST PNRNSS KRRS S LPNGE 


5728 


2 


877 


gtrkgqfeprrgrawegsagglrApgaaaggpgvqprgsg/lpg" 
nairagvnpgrgpaspfwdlslpwdlwppptdhapgapdfpave 
GR\ pwaggrppwpvsgvlgsrvcgplystspagpg/sggls PSQ 

GGPAGAGGDAG/LPGRCPSAPWRAGSRPAASCPDWIPGPQGLWIi 
HRNPTS/GPPSQIGEGAEQGDEGVADAPQIQCKN/GAEDPPAED 
EPPQVPEAGEEDAVPAEEGPGGTPETQADQVRERPEAHLAEGGA 
KGSPRRLADPQDtjPAGQMSLAPPFPPVAAVIRSNK 


5729 


1 


1525 


AGGAREVLTLgLGHFAGFVGAHWWNQQDAALGRATDSKEPPGEL 
CPD VL YRTGRTLKGQETYTPRL ILMD hKGSLSSLKEEGGLYKDK 
QLDAAIAWQGKLTTHKEELYPKNPYLQDFLSAEGVLSSDGVWRV 
KS I PNGKGSS PLPTATTPKPLIPTEAS IRVW S0FLRVHLHPRSI 
CM I QK YNHDGEAGRLEAFGQGES VI*KE PKYQEELEDRLHFYVEE 
CDYLQGFQILCDLHDGFSGVGAKAAELLQDEYSGRGIITWGLLP 
G P YHRGEAQRNI YRLLNTAFGLtVHLTAHSSLVCPliS LGGSLGLR 
PEPPVS FPYLHYDATLP FHCSAIIATALDTVTCS \ YRLCSS PVS 
PWHL\ADMLSF<^KKVVTAGAIIPFPLAPGQSLPDSLMQFGGAT 
PWTPLSACGEPSGTRCFAQSWLRGIDRACHTSQLTPGTPPPSA 
LHACTTGEE IIAQYLQQQQPGVMS S SHIiLLTPCRVAPPYPHLFS 
SCSPPGMVLDGSPKGAAVESVPVFG 


5730 


1258 


1713 


KKFQAPARETC VECQKTVYPME RLLANQQVFH I SCFRCS YCNNK 
IiSLGTYASLHGRIYCKPHFNQLFKSKGNYDEGFGHRPHKDLWAT 
KIETEGFWERPRNFENQGRPLKSPGGEDCPSC*GGCPGSNy*AQ 
GSSSREKGGQASWKPKLRVA 


5731 


122 


443 


RSHRGEL I PKDS CYMRKPPRRP BCKRRQG /CALPOGCLrFKOVA t 
EFSLEEWKCLKPAQRALYRAVMLENYRNLESVGLTSKDSWYMRK 
KPGRGRGKQRRQEWFFLRVY 


5732 


226 


772 


PPS RS CQSPRRKS RRRAHVTVTIiVCGFTSFSFSLPL YDCGCLRF 
PERTCSQIiQQADWAPDFGPSSFVPSWGATATGARKFLIAFNl\N 
IJ^TKEO^^IALNLREQGRGKDQPGRLKKVQGIG^LDEKNLA 
QVSTNIiLDFEVTALHT\TYEETCREAQEL.SLPVVGS01>VGLVPLK 
ALLDAA 


5733 


1 


460 


PALQEVNANALAWGKQYENDARTLFEFTSGVNDTESP 1 1 YRDES 
MRTACS PDGLCSDGNGLELKCPFTSRD FMKFRLGG FEAI KS AYM 
AQVQ YSMWVTRKNAW YFANYDPRMKREGLH YWI ERDE KYM \AS 
FDEI\VP\EFIGKMDEVLSRDPM 


5734 


3 


968 


RCNSPESLTSLI.VLLTTANNLFVLIPAYSKNRAYAIFFIVFTVI " 

GSLFLMNLLTAIIYSQFRGYLMKSLQTSLFRRRLGTRAAFEVLS 

SMVGBGGAFPQAVGVKPQNLLQVLQKVQLDSSHKQAMMEKVRSY 

HYYFDYLGNIiIAIiANLVS ICVFLVLDADVIiPAERDDFI IiGILNC 
VFIVYYIjLEMLLKVFALGLRGYLSYPSNVFDGLLTVVLLVLEIS 
TIi\VCTDCHTQAGGRRWW/RIiLSLWDMTRMLNMIjIVFRFLRIIP 

smkpmawastvlgl 


5735 


2 


540 


fftpcvarafnfpdqatvkkaayslprvgggtscglpqarr ISL " 
atprqlyk/ssnmtqrwqrreisnfeylmflntiagrtyndlnq 
ypvfpwvltnyes eel>dltlpgnfrdls kpigalnpkravfyae 
ryetweddqsppyhynthys tatstls wlvri VS I F ielaclwy 

LKILT 


5736 


1 


382 


gtrpstkksgyspqqvavihckghqkentavahskqkadsaaqv 








TARLSVTPFNIiLFTVS FPQPDLPDNPVYSTTTEKIaASDLRANKN 
*** 0 ±uru&\3±t* if* 1 * IoXIiQSTTHIiRkAKLPQLLRR 


5737


290 


1041 


kaclhlls s fltsnflfnpll pds l ys vearsqranlgp crjrkr 
lqtlmrlaagfqysshkdpslsakeketdyhneargpwpgwvg* 

RTADGSCGRGPDGAHHPGPKSSSWRASRLLPGLGGSHHLDAYVG 
RDLECGTPAPLQLEIPPQPRGHPAPIPTGQAGPRDSGPGASP*V 
ETR PLTDGRR * PGVR PVG WTP AHPAGTLRPRGAVE P S VS ACGKW 
APS PTSQGCCEGRCDAVPKHRAWRTPLCSQ 


5738 


a 


460 


DTLS LNCTLPETLPMTPS F*LS FL * PPGLARAKSI PTKT YSNB V 
VTLWYR P PD I L LGS TDYSTQI DMW * GQVE VWQGPCGKGGGLVTT 

ATQPAAFLFTVPSLPRGVGCIFYEMATGRPLFPGSTVEEQLHFI 
FR I ZiSEEAWALCAVETHR 


5739 
5740 


1 


1222 


SFQRRGIRWNVHTLHPHPRAVWAGIGRGHGS *ALLGRARAPALC 
FPTLLEFLESLEPDLPALRAMGLHLWAAGPGTHPAGISDIiLAEV 
SAEVDGPVPGYLSS PQS ITDTCXYI FTSGTTGLPKAARISHLKI 
LQCQGFYQLCGVHQEDVIYLALPLYHMSGSLLGIVGCMGIGATV 
VLKSKFSAGQFWEDCQQHRVTVPQYIGELCRYLVNQPPSKAERG 
HK VRLAVG SGLRPDTWER FVRRFGPLQVLET YGLTEGNVAT INY 
TGQRGAVGRASWLYKHIFPFSLIRYDVTTGEPIRDPQGHCMATS 
PGE PGLL VAP VSQQS P FLG YAGGPE LAQG KLLKDV FRPGDVFPW 
TRDLLVCDDQGFLRFHDRTGDP FRWKGENVATTE VAE V FE ALDF 
LQEVWVYGVTV 




255 


231 


PAYWLKVPTLCLESKTDLREKASHVSAQLQGEVRGLAGALWM*A 
YVYER VYN * N IS RMVHALEQKRHPAGLS S S MALQLNP CLGMLMA 
LQSELHKLYDEETQSWVSGSACGGYP 


5741 


1 


650 


PRKTMRRGVLMTLLQQSAMTIiPLWlGKPGDR PP P LCGAIPASGD 
YVARPGDKVAARVKAVDGDEQW 1 LAEWS YSHATNKYE VDD1DE 
EGKERHTLSRRRVIPLPQWKANPETDPEALFQKEQLVLALYPQT 
TCFYRALIHAPPQRPQDD YS VLFEDTS YADG YS P PLNVAQRYW 
ACKEPKKK*CRLADSPSPNDTGQDSRGRAGIKHIPPLKKK 


5742 


2 


362 


TQS VKE I LKRNPNVNLTDKDGNTALM I ASKEGHTE I VQDLLDAG 
ITVNI PDRSGDT VLIGAVRGGHVE I VRALLQKYADID1RGQDNK 
TALYWAVE KGNATMVRDI LQCNPDTE I CTKDG 


5743 


2 


415 


GKTPEGIDAIEEIEIDLEETEREISPQENGLEEVKPLGEMQTDL 
KATGRE I S PREKTPEVIDATEE IDKDLEETGRREIS PEENGPEE 

VKPVDEMETDLKTTGREGSSREKTREVIDAABVIETDLEETERE 
ISPQE 


5744 


3 


703 


TRRTTTTS PTTTRQMTTTPAALPTT WTTPDLTTGT PLQMTT I A 
VFTTANrCLSLTPSTLPEEATGLLTPEPSKEGPIIiTAESETVLP 
SDSWSSAESTSADTVLLTSKESKVWDLPSTSHVSMWKTSDSVSS 
PQPGASDTAVPEQN KTTKTGQMDG I PMS MKNEMP I SQLLM 1 1 AP 

SLGFVLFALPVAFLLRGKLMETYCSQKHTRZJDYIGDSKNVLNDV 
QHGREDEDGLFTL 


5745 
5746


1400 


599 


GKSRFVNLMKHSKKTYDSFQDELEDYIKVQKARGLEPKTCFRKM 
KGDYLETCGYKGEVNSRPTYRMFDQRLPSETIQTYPRSCNIPQT 
VENRLPQWLPAHDSRLRLDSLSYCQFTRDCFSEKPVPLNFNQQE 
L ^ b Hlj VSMRVYKH FSSDNS TS THQASHKQ IHQKR KRHPEEGR 
EKSEEERSKHKRKKSCEEIDLDKHKSIQRKKTEVEIETVHVSTE 

KLKNRKEKKSRDWSKKEERKRTKKKKEQGQERTEEEMLWDQSI 
LGF 


5747 


3 


821 


Sr'ASGRLTPSSPAFDGELDLQRYSNGPAVSAWSLGMGAVSWSES 
RAGERRFPCPVCGKRFRFNSILALHLRTHQPERPRSPAARLLLE 
LEERAL LRE ARLGRARS S GGMQAT PATEGLARPQAPSS S AFRC P 
YCKGKFRTS AERERHLHI LHRPW KCGLCS FG S SQEEELLHHSLT 
AHGAPERPLAATSAAPPPQPQPQPPPQPEPRSVPQPEPEPQPER 
EATPTPAPAAPEEP PAP PEFR CQVCGQS FTQSVJFL KGHMRKHKA 
SFDHACPV 




2 


1328


0RHVETL C IHFLGPSTGS TAKTGGRNWLKTGNCL YGNTCR^VHG 
PSPRGKGYSSNYRRSPERPTGDLRERIKNKRQDVDTEPQKRNTE 
SSSSPVRKSSSRGRHREKEDIKITKERTPESEEENVEWETNRDD 








SDNGUINYDYVHELSLEMKRQKIQRELMKLEQBNMEKREEillK 
KEVSPEWRSKLSPSPSLRKSSKSPKRKSSPKSSSASRKDRKTS 
AVSSPLLDQQRNSKTNQSKKKGPRTPSPPPPIPEDIALGKKYKE 
KYXVKDR I E E KTRDGKDRGRDFERQREKRDKPRSTS PAGQHHS P 
ISSRHHSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 
AS PYPSHSLSS PQRKQS PPRHRSPMREKGRHDHERTSQSHDRRH 
ERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDREPRDGRDR 
RE 


5748 


934 


473 


SEGPQVFYKGLAPTLIAIFPYAGLQFSCYSSLKHLYKWAIPAEG 
KKNENLQNLLCGSGAGVIS KTLTYPLDLFKKRLQVGGFEHARAA 
FGQVRRYKG LMD CAKQVLQ KEGALGFFRGLS PS LLKAAXiS TGPM 
FFSYEFFCNVFHCMNRTASQR 


5749 


552 


1 


GFPVDPRVRGSTLSLAERPKGMIRSGS FRDPTDDVHGS VLSIAS ' " 
SASSTYSSAEERMQSEQIRKLRRELESSQEKVATLTSQLSANAN 
LVAAFEQSLVWMTSRLRHLAETAEEKDTELLDLRETIDFLKKKN 
SBAQAVIQGALNASETTPKELRI KRQNS SDS I S SLNS I TSHS S I 
GSSKDADA 


5750 


22 


866 


IFISICLWKAHLCFLLLPKDCIDQVMKLQNLFVDDSGRYIiAIQF 
IILEWAYVFLYYYEYRKAKDQIiDIAKDISQLQIDLTGALGKRTRF 
QENYVAQLILDVRREGDVLSNCEFTPAPTPQBHLTECNLELNDDT 
ILNDIKIADCEQFQMPDLCAEEIAIILGICTNFQKNNPVHTLTE 
VELLAFTS CLLSQ PKFWAIQTSALIIiRTKLEKGSTRRVBRAMRQ 
TQAIADQFEDKTTSVLERLKIFYCCQVPPHWAIQRQLASLLFEL 
GCTSSALQIFEKIjEMWE 


5751 


3 


751 


SCGSALRAWRCGAAALATFPAPALPGLMYRALYAFRSAEPNALA 
FAAGETFIiVLERS SAHW WIAARARS GETGYVPPAYLRRLQGLEQ 
D VLQAI DRAI EAVHNTAMRDGG KYS I»EQRGVLQKL IHHRKETLiS 
RRGPSASSVAVMTSSTSDHH1J3AAAARQPNGVCRAGFERQHSIjP 

ssehlgadgglfqi plpssqi ppqprraapttppppvkrrdrea 
lmasgsgghntmpsggnsvssgssvssci 


5752 




471 


gpvcgvglsvawagpwrgpvhsvggggraalhgaelpcojsgaat ' " 
veremelrkknemlrveteararakaerenadi ireqirlkase 
hrqtvlesirtagtlfgegfrafvtdrdkvtatvwifikqgwqv 
aerqhvgaswsprscpcrlctal 


5753 


34 


483


DDSXAI PGGVQAP FGAVRN I YTPRTGHRIRKLDQ I QSGGKYVAG 
GQEAFKKLNYLDIGEIKKRPMEWNTEVKPVIHSRINVSARFRK 
PLQEPCTIFLIANGDLINPASRLLIPRKTLNQWDHVIjQMVTEKI 
tlrsgavhrlytlegrlv 


5754 


14 


331 


TLVHVVEFAGEHAEAIASREQEVIjQGWKELLSACEDARIiHVSST 
ADALRFHSQVRDIiLSWMDGIASQIGAADKPRCPSSLLGLPASPW 
WPTPATPS PLTAPFSME 


5755 


3 


888 


lgdqfykeaiehcrsynsrlcaersvrlpfldsqtgvaqnncyi 
wmekrhrgpgiapgqlytyparcwrkkrrlhppedpklrlle I K 
pevelplkkdgftsesttleallrgegvekkvdareeesxqeiq 
rvlendenveegneeedleedi p krknrtrgrargs aggrrrhd 
aasqedhdkpyvcdi cgkryknrpgls yhyahthias eegdeaq 

DQETRSPPNHRNEiraRPQKGPDGTVIPNITYCDFCLGGStJMNKKS 

grpeelvscadcgrsahlggegrkekeaaa 


5756 


3 


621 


SSKDQALFAHPLYNVPEEPPLIiGAEDSLIASQEALRVVRRKVAR 
WNRRHKMYREOMNT.T^T J nPPT.OTiT?r.PaQ&JVnP"-TT/lTTJDljr t r von 

* ■*■ **"0^£*'J1.1I JJ J. iJ UiSiT XT 2-ilj£J-ll\ riV\£C *1JJN7 X Vi xCtHjfij ISA 

SSPWSKLLQDMRHFPTXSADYSQDEKALLGACDCTQIVKPSGV 
HLFCLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEI 
AAFHLDRILDFRRVPPTVGR ivnvtkeil 


5757 


3 


473 


YKDALLIiPDNHRQWFENGTIjKLTDVQKGMDEGEYIiCSVLIQPQ 

lsisqsvhvavkvppliqpfefppasigqiilyipcwssgdmpi 
ritwrkdgqviisgsgvtieskefmsslqissvslkhngnytci 
asnaaatvsrerqlivrvpprfw 


5758 


1 


474 


frrgagaergehregergaagmgefkvhrvrffnyvpsgxrcva 
ynnqsnrlavsrtdgtve I ynlsawyfqekffpghesratealc 
waegqrlfsaglngeimeydlqalnxkyamdafggpiwsmaasp 


5759 


2 


1240 


■www ■ VJ<-j*-»*^tJt3 V XVXJi* W«L X CURiX f V 

gnaafagqgvvyetfhmsdlpsVttngtvhwvnnqigfttdpr 

MARS S P Y P TDVAR WNAP I FHVNADDP E AV I YVCS VAAE WRNT F 
NKDVGADLVC YRRRGHNEMDE PMFTQPLM YKQ I HRQVP VLKK YA 
DKL IAEGTVTLQEFEEE I AKYDRI CEEAYGRSKDKKI LHIKHWL 
DS PWPG FFNVDGEP KS MTCPATGI P EDMLTHIGS VAS S V? LEDF 
KIHTGLSRILRGRADMTKNRTVDWALAEYMAFGSLLKEGIHVRL 
NGQDVERGTFSHREHVLHDQEVDRRTCVPMNHLW PDQA P YTVCN 
SSLSEYGVIjGFE^YAMASPNALVLWEAQFGDFHNTAQCIIDQF 
ISTGQAKWVRHNGIVLLLPHGMEGMGPEHSSARPERFLQMSNDD 
SDAYPAFTKDF3VSQL 


5760


1 


1221 


VRDI TSD SLS LS WT VP EGQ FDK FLVQFKNGDGQ P KAVRVPGHE D 
GVTI SGLE PDHKYKMNL YGFHGGQR VGPVS AVGLTAPGKDEEMA 
PAS TE PPTP EP ? I KPRLEE LTVTDAT PDSLS LS W TVP EGQ FDHF 
LVQYKNGDGQPKATRVPGHEDR VTI SGLE PDNKYKMNLYGFHGG 
QRVGPVSAIGVTAAEEETPTPTEFSMEAPEPPEEPLLGELTVTG 
SS PDSLS LS WTVPQGR FDS FTVQ Y KDRDGR PQ WRVGGEES E VT 
VGGLEPGRKYKMHLYGLHEGRRVGPVSTVGVTAPQEDVDETPSP 
TEP GTE APBP PEE PLLGE L TVTGS S PDSLS LS WTVPQGRFD S FT 

VQYKORDGRPQAVRVGGQESKVTVRGLEPGRKYKMHLYGLHEGR 
RT iGP VQ ZV TfZ\T'V 
rvjjor v ^J\x\s V X 


5761 


3 


1275 


SCDMAE AAAJj VW I RGPGFGCKAVRCASGRCTVRD^ i HRHCQDQN 
VPVENFFVKCNGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGAQ IEKTTNREACRDLSGRRLRDVNHEKAMAEWVKQQAERE 
AEKEQKRLERLQRKLVE PKHCFTS PD YQQQCHEMAERLEDSVLK 
GMQAASSKMVSAE I SENRKRQWPTKS QTDRGAS AGKRRCFWLGM 
EGLETAEGSNS ES S DDDS EEAPSTS GMGFHAP K I G SNGVEMAAK 
t tfb^bUKAKVVNTDHGS PEQLQI PVTDSGRH ILEDSCAELGES FC 
EHKESRMVTETEETQEKKAESKEPIEEEPTGAGLNKDKETEERT 
DGERVAEVAPEERENVAVAKLQESQPGNAVIDKETIDLLAFTSV 
AELELLGLEKLKCELMALGLKCGGTLQ 


5762 


2 


344 


t» & ixsu x fijHfamatGGGGSGGGRRRTPRGM PKEKYE P PD PRRM YT I 
MSSEEAANGKKSHWAELE ISGKVRSLSASLWSLTHLTALHLSDN 
SLSRI PSD I AKLHNLVYLDLSSNKIR 


5763 


3 


429 


LDKDTGLI ML IARLD YELI QRFTLT 1 I ARDGGGEETTGR VR I W V 
LDVNDNVPTFQKDAYVGALRENE PSVTQLVRLRATDEDS PPNKTQ 
I T YS I VSASAFGS YFD I SLYEG YG VI S VSRPLDYEQ I SNGL I YL 
TVMAMDAGN 


5764 
5765


19 


441 


v i. JJKyK YDBNEDlibJDViiEI VS VRGFSLEEK 
LRSQL YQGDFVHAMEGKDFNYE YVQREALRVPLI FRE KDGLG I K 
MPDPDFTVRD VKLLVGS RRLVDVMDVNTQKGTEMSMS QFVR YYE 
TPEAQRDKL 




3 


825 


QKILRU^SHQPPTSSSNSKDCGGPASSGAGATAALADfiLKFAS 
VQAS APQGNS HKETS KS KVKRS KTS KDANKS LPSAAL YGI PE X S 
STGKRQE VQGR PGE ATGMNSALGQS VS SGGS GNPNSNSTS TSTS 
AATAGAGS CGKSKEEKP GKSQS S RGAKRDKDAGKSRKDKHDLLQ 
GHQKGSGSQAPSGGHL YGFGAKSNCMna e? wvrvznrrr* or* cir7iK* 

GEVS KSAPDSGLMGNSMLVKKEEEEEESHRR I KKLKTEKVDPLF 
TVPAPPPHV 




1608 


663 


SGLF SVDPASSQAMBLSD VTL t EGVGNEVMVVAGVVVLILALVL 
AWLSTYVADSGS NQLLGAI VSAGDTS VLHLGHVDHLVAGQGNPE 
PTELPHPSEGNDEKAEEAGEGRGDSTGEAGAGGGVEPSLEHLLD 
IQGLPKRQAGAGSSSPEAPLRSEDSTCLPPSPGLITVRLKFLND 
TEELAVARPEDTVGALKSKYFPGQESQMKLIYQGRLLQDPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLMVPVFWLLGVVWYFRINYRQFFTAPATVSLVGVTVFFS FLV 
FGMYGR 


5767 


2 


892


^FRATPRPPTliPiSLRTGTEVILWYLDWRALMKRKRMKANIKLVG 
3G FPLPSSDLDDSLTEE IDE fCIG FRNDANFDWQNVADFRDAGGS 
jTEVKVEEEERDPQSPEFEIEEEEEMLSSVIPDSRRENELPDFP 








HIDEFFTLWSTPSRSAYDfipfJLLVNIEKQKIiELEKRRLDIEAEB 
LQVEKERLQIEKERLRHLDMEHERLQLEKERLQIEREKLRLQIV 
NSEKPSLENELGQGEKSMLQPQDIETEKLKLERERLQLEKDRLQ 
FLKF3SEKLQ IEKERLQVEKDRLRI QKEGHLQ 


5768 


3 


476 


SSRSRLSVSVSPPPPGIVBLGPPFAWEFCSRLGSAVTSQRAGPA 
AAMVAKDYPFYLTVKRANCSLELPPASGPAKDAEEPSNiCRVKPL 
S RVTS LANL I P P VKAT P LKRFSQTLQRS ISFRSESRPDI LAPR P 
WSRNAAPSSTKRRDSKLWSETFDVC 


5769 


38 


667 


TK^KKG VKE KATDQS VKAFAEHC PE LQ YVGFMGCS VTS KG VIHL 
TKLRNLSS LDIiRHITELDNETAME I VKRCKNLI S LNLCIiNWI IN 
DRCVE VIAKEGQNLKEL YLVSCKI TDYALI AIGR YSMTI ETVDV 
GWCKE I TDQGATLI AQS S KS LR Y1*GLMRCDKVNE VTVEQLVQQ Y 
PHITFSTVLQDCKRTLERAYQMGWTPNMSAASS 


5770 


1 


484 


DSRRYDVKTRKWSFLLEEHSKLIAKVRCLPQVQLDPLPTTLTLA — 
FASQLKKTSLSLTPDVPEADLSEVDPKLVSNLMPFQRAGVNFAI 
AKGGRLLLADDMGLGKT IQAI CI AAFYRKEWPLLVWPSS VRFT 
WEQAFLRWIiPSLS PDCINWVTGKDRLTA 


5771 


168 


741 


GLLPSACLRARSWREASEGPSSRACSNGSQDTFEACYSGTSTPS 
FHGSHCSGSDHS S LGLEQLQDYM VTLRSKLGPLE IQQFAMLLRE 
YRLGL P I QD YCTGLLKL YGDRRKFLLLGMRP F I PDQDIGYFEGF 
LEGVG IREGGILTDS FGRI KRSMSSTSASAVRS YDGAAQRPBAQ 
AFHRLLADITHDIE 


5772 


148 


383 


EFNLALVSPSHPQIKAEDDQPIjPGVIiIjSIjSGGIjFRSNI4I.TQDNG 
ILTFSNLVTCSAI YHLPVFPERE PGCSMRDLRVA 


5773 


2 


723 


prvrskhnfcfmemotrlqvehpvtemitgtdlvewqiJr"iaage" - ' w 

kiplsqeeitlqghafeariyaedpsnnfmpvagplvhlstpra 

dpstrietgvrqgdevsvhydpmiaklvvwaadrqaaltklrys 

i*rq yni vglhtntdpllnlsghpe feagnvhtdfi pqhhkqlhh 

srkaaakeslcqaalglilkekamtdtftlqahdqfspfssssg 

rrlnisytrnmtlkdgknsk 


5774 


2 


592 


FVEEENIRWRCGGSELNFRRAVFSADS KYI FCVSGDFVKVYST 
VTEECVHILHGHRNLVTGIQLNPNNHLQLYSCSLDGTIKLWDYI 
DG I L I KTFI VG C K LHALFTIAQAEDS VFVTVNKE KPD 1 FQL VS V 

KLPKSSSQEVEAKELSFVLDYINQSPKCIAFGNEGVYVAAVREF 
VLSVYFFKKETTSRVTLSSS 


5775


3 


538


SSGCCDPAAPSSLAEAATMPVSKCPKKSESLWKGWDRKAQRNGt, 
RSQVYAVNGDYYVG EWKDNVKHGKGTQVWKKKGAI YEGDWKFGK 
RDGYGTL.SLPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY 
EGDWCGSQRSG WGRMY YSNGD I YEGQ WENDKPNG EGMLRLS QN P 
RP 


5776 


2 


484 


RLPQDCVCQNLSESL/3TLCPSKGLItFVPPDIDRR.TVELRLGGNF 
IIHISRQDFANMTGLVDLTLSRNriSHIQPFSFLDLESLRSLHL 
DSNRLPSLGEDTLRGLVNLQHLIVNlJNQLGGIADEAFEDFLIiTL 
EDLDLSYNNLHGPAVGLRGDAWVQPSTS 


5777 


2 


949 


GQDPE PGQDLFQ PERE VD P S WGRGREPRLGKI»RFQN DHLSVLKQ 
VXKLEOALKDGSAGLDPQLPGTCYSPHCPPDKAEAGSTLPEITLG 
GGSGSEVSQRVHPSDLEGREPTPELVEDRKGSCRRPWDRSLENV 
YRGSEGSPTKPFINPI>PKPRRTFKHAGEGDKDGKPGIGFRKEKR 
NLPPLPSLPPPPLPSSPPPSSVNRRLWTGRQKSSADHRKSYEFE 
DLLQSS SESS RVDW YAQTKLGLTRTLSE ENVYED ILD P PMKEKTP 
YEDI ELHGRC LGKKCVLNFPAS PTSSI PDTLTKQSLS Kp AFFRQ 
NSSRRNV 


5778


1 


1210 


QRRQSVSRLLLPVFLLEPPAEPGLEPPPEEEGGEPAC^VAEEPGS" 
GGPCWLQLEEVPGPGPLGGGGPLRS PSSYSSDELS PGEPLTS P P 
WAPLGAPERPEHLLKR VLERLAGGATRDS AAS D ILLDDI VLTHS 
LFLPTEKFLQELHQYFTOAGGMEGPEGIiGRKQACLAMLtiHFLDT 
YQGLLQEBEGAGHI I KDLYLLIMKDBSLYQGLREDTLRIiHQLVE 
TVELKIPBEWQPPSKQVKPIiFRHFRRIDSCLQTRVAFRGSDEIF 
CRVYMPDHS YVTIRSRLSASVQD I LGSVTEKLQ YSEEPAGREDS 
LILVAVSSSGEKVLLQPTEDCVFTALGIWSHLFACTRDSYEALV 








plpeeiqvsi>gdteihrvepedvanhltafhwelprcvhh;lefv 

DYVFHGE 


5779 


138 


1571 


EAVQVLI KH S AD VNAR D KNWQT PLHVAAAN KAVKCAE VII PLLS 
S VNVS DRGGRTALHHAALNGHVEMVNLLLAKGAN I NAFDKKDRR 
ALHWAAYMGHLDWALLINHGAEVTCKDKKG7TPLHAAASNGQI 
NWKHLLNLG VE I DE Z NV YGNTALH I ACYNGQDAWNE L IDYGA 
NVNQPN^GFTPI^PAAASTHGALCLELLVNNGADVNIQSKDGK 
SPLHMTAVHGRFTRSQTLIQNGGEIDCVDKDGNTPLHVAARYGH 
EXXINTLITSGAIXTAKCGIHSMFPLHIiAALNAHSDCCRKLLSSG 
QKYS I VSIiFSNEHVLSAG F E I DT PD KFGRTCLHAAAAGGNVECI 
KLLQSSGADFHKKDKCGRTPLHYAAANCHFHCIETLVTTGANVN 
ETDDWGRTA LHYAAAS DMDRNKTI LGNAH DNS EELERARELKE K 
EATIiCLSFLLQNDANPS I RD KEG YNS IH YAAAYGHRQCLE LLLE 
RTNSG F3ESDSGATKS PLHLAVS EM P 


5780 


154 


624 


UFFRVITCJbPFKGPDYRLYKSEPELTrVAEVDESNGEEKSEPVS 
EIETSWKGSHFPVGWPPRAKSPTPESSTIASYVTLRKTKKMM 
DLRTER PRSAVEQL CLAE STRPRMTVE EQMER I RRHQQACLREK 
KKG.LNVIGASDQSPLQSPSNI.RDNP 


5781 
5782 


19 


941 


RGSLGGHPwr^PMRAASQGCIjPVSFVI'UPHQERAYGGRGPGGAF" 

PAPPVSGTCPPDLIYAPTPEJCAEGGSQKNHQPPPGERAAHRDGE 

QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPLGQ 

VQPHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 

QHS1HTVTCKSPRQKEDRSPKPPQAPKHPEBHGRQS\QAPPPLP 

VAPSRT03GC*TWDPALLVSP/PQGDSTPELPAP\QQPTGGPSR 

CRQALPPQG*RQQPRQRPR/PTGASRSHPAKAKGCQGPPKIRNY 
NIMD 


5783 


5176 


1237 


drsmmsmaadsytdsytdtyteaymvpplPpeepptmpplp'pee - 

PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 
SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 
PEPESS1TLTPVESAWAEEHEWPERPVTCKVSETPAMSAEPT 
VLASBPPVMSETAETFDSMRASGHVASEVSTSLLVPAVTTPVIiA 
ESILEPPAMAAPESSAMAVLESSAVTVIjESSTVTVLESSTVTVL 
EPSWTVPEPPWAEPDYVTI PVPVVSALEPSVPVLEPAVSVLQ 

psmivsepsvsvqestvtvsepavtvseqtqviptevaiestpm 
iless imsshwkgi nlssgdqnlape i gmqeialhsgeephae 
ehlkgdfyesehgiwidlninnhliakemehntvcaagtspvge 

IGEEKILPTSETKQRTVLDTYPGVSEADAGETLSSTGPFALEPD 

atg\tskgisfttastlslvnkydvdlslttqdtehdmlistsp 

SGGSEADIEGPLPAKDIHLDLPSNINLVSSDTNEPLPVKRD\DQ 
TLAALI \SLKESSGGEKEVPPPS *"REHLPDSGFSAWIEDINEAD 
LVRPVSSPRTWNVLPSPRAGIj\EGP\LLASDFGPVQNI*YSSPW 

\ssmp\erasgs\ssgekgg\yeifvkvkdthekskkhknrdkg 

E KEKKRDS S LRSRS KRS KSS EHKS RKLTSESRSRARKRSSKS ICS 

hrs\qtrsrsrs/rdrrrrssrsrsksrgrrsvskekrkrspkh 
rsksrerkrkrsssrdnrktvrarsrtpsrrsrshtpsrrrrsr 
svgrrrsfsispsrrsrtpsrrsrtpsrrsrtpsrrsrtpsrrs 

RT PS RRSRTPSRRRRS RS WRRRS FS IS P VRLRRSRTPLRRRFS 

rspirrkrsrssergrspkrltdldkaqlleiakanaaamcaka 

GVPLPPNLKPAPPPT1 EEKVAKKSGGAT IEELTEKCKQ I AQSKE 

DDDVIVNKPHVSDEEEEEPPFYHHPFKLSEPKPIFFNLNIAAAK 
PTPPKS QVTLTKEFP VSSGSOHR KKEADS VYn FWUDUp vwr- ppm 

KDDDNWSSNLPSEPTOISTAMSERALAQKRLSENAFDLEAMSM 
LNRAQERIDAWAQLNS I PGQFTGSTGVQVLTQEQLANTGAQAW1 
KKDQFLRAAPVTGGMGAVLMRIO-IGWREGEGLGKMKEGNKEPILV 
DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 
KRR WQP PE FLLVHDSG PDHRKHFLFR VLINGSAYQPNCMFFItNR 
Y 




1693 



□SGLRVAFTMEG ISNFKTPSKLSE KKKSVLCSTPTINI PASPFM 
2KLGFGTG VNVYI*MKRS PRGLSHS PWAVKKINP I CNDH YRS VYQ 
<RLMDEAKILKSLHHPNIVGYRAFTEANDGSLCLAMEYGGEKSL 
TOLIEE/PI*SQ/PKII»FQQP/LILKVALNMARGIjKYIjHQEKKIj 








LHGD I KS SNWI KGDFBT I KI C DVGVSLP LDENMT VTD PEACYI " 
GT EPWKPKEAVE ENG VITD KAD I FAFGLTLWEMMTLS I PHINLS 
NDDDDED KTFDES DFDDEA Y YAALGTRP P INME ELDES YQKVI E 

T.l?cur w 7 , MT?nnin^tiTve% AiTTTinn » — _ 


5784 


2669 


1388 


PRVRPRVRTDHNYYISRiyGPSDSASRDLWV^IDQMEKDKVKIH 
GXLSNTHRQAARVNLSPDPPFYGHFLRE I TVATGGFI YTGE WH 
RMLTATQ Y IAPLMAN F DPS VSRNS TVRY FDNGTAL VVQWDHVHL 
QDNYNLGS FTFQATLLMDGRI I FG YKE I PVL VTQ I SS TNHPVKV 
GLS DAF VWHRIQQ I PNVRRR? I YEYHR VELQMS KI TN I S AVEM 
TPL PTCLQFNRCG P CVS S QI G FNCS WCS KLQRCS S GFDRHRQDW 
VDSGCFEESKEKMCENTEPVET\FtiEPPQP*SRQPPSSGS*LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAGLI VG 1 LI LVL I VATA IL VTVYM YHH PTSAAS I FFI ERRFSR 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5785


2659 


1388 


PR VR P RVRTDHN YY I SRI YGPSDS ASRDL W VN I DQMEKDKVK I H 
GILSNTHRQAARVNLSFDFPFYGHFLREITVATGGFIYTGEWH 
RMLTATQ Y 1 APLMANFD P S VSRNS TVR YFDNGTAL WQWDHVHL 
QDMTYNLGS FTFQATLLMDGR 1 3 FG YKE I P VLVTQI S STNHP VKV 
GLSDAFWVHRIQQIPNVRRRTIYEYHRVELQMSKITNISAVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 

vdsgcpeeskekmcentepvet\ fleppqp*erqppssgs *LPP 

E / DAVTSQFP TSL PTEDDTK IALHLKDKGASTDDSAAE KKGGTL 
HAGLIVGILILVLrVATAILVTVYMYHHPTSAASIFFIERRPSR 
W PAMKFR RGSGHPAYAE VE P VGE KEG F I VSEQC 




2532 


1574 


SYKLPAAERRASSCSQPPTPTRRRWPAPGRTSRGHRPQM*SGTP " 
APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRFC*SIiN*M 
S *H * KRNLSQRSS SMSRRPLS CAR PHR * * RQGLTVAARL PT WAK 
SPPLACSFCQAAQKSQSLSSGRSTR*PBRMSFRP\SPPGNPAIP 
SLAPS SRP/PKGRPQCTWIPSRWPASPTAPPTTT*APTSSPGST 
GRSMMTCPTRWTATPWSARASSRPRNWPTP * WRPSGRLS TV* RA 
TGGSTATAP PKRFPRNWNPMMAE 


5787 


2 


1460 


MASAASVTS LADE VNCP \ I CQGTLKEAGSLSNCG/HKNFCRACL ~ 

t\ryce ip \gpd vlees p\ tcp\ lckepfrp\gs frpnwolanv 
venierlqlvstlglgeedvcqehgekiyffceddemqlcwcr 
eagehathtmrfledaa\apyreqikkcbkclikerebiqeiqs 

RENKRMQVLLTQVSTKRQQVISEFAHLRKFLEEQQSILLAQLES 
QDGDI LRQRDEFDLLVAGE I CRFSAL I EELEEKNE RPARELLTD 
IRSTLIRCETRKCRKPVAVSPELGQRIRDFPQQALPLQREMKMF 
LEKLCFELDYEPAHISLDPQTSHPKLLLtSEDHQRAQFSYKWQNS 

pdkpqrfdratc^vlahtgitggrhtwwsidlahggsctvgvvs 
edvqrkgelrlrpeegvwavrlawgfvsalgs fp\trltlkeq p 

rqvrvsldyevgwvtftnavtrepiytftasftrkvipffglwg 
rgssfslss 


5788 


2 



6860 



shsvsgrssaygdataeghpagpgsVssstgaistttghqegdg 
segegegetegdvhtsnrlhmvrlmllerllqtlpqlrnvggvr 
aipymqviiwlttdldgedekdkgaldnllsqliaelgmdkkdv 
skknersalnevhlwmrllsvfmsrtksgskssicesssliss 
ataaallssgavdyclhvlkslleywksqqndeepvatsqllkp 
httssppdmspfflrqyvkghaadvfeaytqlltemvlrlpyqi 
kkitdtnsripppvfdhswfyflseylmiqqtpfvrrqvrklll 
F I cgske kyrqlrdlhtlds \hvrg ikklleeqg I flras wta 
spqsalqydtlislmehlkacaeiaaqrtinwqkfcikddsvly 
fllqvs flvdegvs pvllqlls calcgskvlralaassgsssas 
sspapvaassgqattqsksstkkskkeekekekdgetsgsqedq 
lctalvwqlnkfadketliqflrcfllesnsssvrwqahcltlh 
iyrnssksqqellldlmwsiwpelpaygrkaaqfvdllgyfslk 

TPQTEKKLKEYSQKAVEILRTQiraiLTNHPNSNIYNTLSGLVEF 
DGYYLESDPCLVCNNPEVPFCYI KLSS I KVDTR YTTTQQWKL I 
SSHTIS KVTVKIGDLKRTKMVRTINLYYNNRTVQAIVELKNKPA 
R WHKAKKVQLTPGQTE VKIDLPLP I VASNLMIE FADF YENYQAS 
rETLQCPRCSASVPANPGVCGNCGENVYQCHKCRSINYDEKDPF 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
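The legend above can be read as a small character code. The following is a minimal sketch, assuming the entries are handled as plain strings; the names `LEGEND`, `MARKS` and `summarize_segment` are hypothetical and are not part of the disclosure. It counts residues and legend marks in a segment and collects any character outside the legend, which in a scanned copy usually indicates a transcription error.

```python
# Hypothetical helper for reading the table's legend; not part of the original text.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue marks defined by the legend.
MARKS = {
    "*": "stop",
    "/": "possible_deletion",
    "\\": "possible_insertion",
}


def summarize_segment(segment):
    """Count residues and legend marks; collect characters outside the legend."""
    summary = {
        "residues": 0,
        "stop": 0,
        "possible_deletion": 0,
        "possible_insertion": 0,
        "unrecognized": [],
    }
    for ch in segment:
        if ch.isspace():
            continue  # whitespace is treated as layout, not sequence
        if ch in LEGEND:
            summary["residues"] += 1
        elif ch in MARKS:
            summary[MARKS[ch]] += 1
        else:
            summary["unrecognized"].append(ch)
    return summary


if __name__ == "__main__":
    # A short fragment in the style of the table entries.
    print(summarize_segment("RGHRPQM*SGTP"))
```

Characters reported under `unrecognized` are only flagged, not corrected, since the intended residue cannot be inferred from the character alone.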








LCNACGFCKYARFDFMbYAKPCCAVD P IENEEU3RKKAVSNINTL 

LDKADR VYHQLMGHR P QLENLLC KVNEAAPE KPQDDSGTAGG I S 

STSASVNRYILQLAQEYCGDCKNSFDELSKXIQKVFASRKELLE 

YDLQQREAATKSSRTSVQPTFTASQYRALSVLGCGHTSSTKCYG 

CASAVTEHCITLLRALATNPALRHILVSQGLIRELFDYNLRRGA 

AAMREEVRQLMCLLTRDNPEATQQMNDLIIGKVSTALKGHWANP 

DLASSLQYEMLLLTDS ISKED5 CWELRLRCALSLFLMAVNI KTP 

VWENITLMCIiRIIiQKLI KPPAPTSKKNKDVPVEALTTVKPYCN 

E I HAQAQIiWf LKRDP KAS YDAWKKCLP IRG IDGNGKAPS KS ELRH 

liYLTEKYVWRWKQFLSRRGKRTSPLDLKLGHNNWLRQVLFTPAT 

QAARQAAC7 1 VEALAT I PSRKQQVLDLLTS YLDELS I AGECAAE 

YliALYQKLZTSAHWKVYLAARGVLPYVGNLITKEIARLLALEEA 

TLSTDLQQGYALKSLTGLLSSFVEVESIKRHFKSRLVGTVLNGY 

LCLRKLWQRTKLI DBTQDMLLEMIjEDMTTGTESETKAFMAVC I 

ETAKRYNLDDYRTPVFIFERLCSIIYPEENEVTEFFVTIiEKDPQ 

QEDFLQGRMPGNPYSSNEPGIGPLMRDIKNKICQDCDLVALLED 

DSGMELLVNNKI ISLDLPVAEVYKKVWCTTNEGEPMRIVYRMRG 

LLGDATEE FI ESLDSTTDEEEDEEE VYKMAGVMAQCGGLECMLN 

RLAGIRDFKQGRHI^TVLLKLFSYCVKVKVNRQQLVKLEMNTIiN 

VMLGTLNLALVAEQES KDSGGAAVAEQVLSIME I \ IQAEPNVEP 

LSEDKGNLLLTGDKDQLVMLLDQ1NSTFVRSNPSVLCGLLRIIP 

YLSFGEVEKMQIIiVERFKPYCNFDKYDEDHSGDDKVFIf\DCFCK 

I AAG I K\NNSNGHQL\ KDL \ ILQXG ITQNALD\ YMKKHI P /SAA 

RI WDADI \WKS FCLRPALP FILRLLRGUAIQHPGTQVL IGTDS I 

PNX»HKLEQVS\SDEGIGTLA\ENI»\LESLREHPDVNKKIDA\AR 

RETRAEKKRMAMAMRQKALGTLG \MTT^EKGQWD/TRTALLEA 

DWEEIiIEEP\GLTCCICREGYKFQPTKVLGlYTFTKRWLGGVW 

ENKPRETSRATSTVSHFNI VH YDC \HLA\AVS LARGREEWESAA 

LQNANT KCNGLLP VWG PHVPE S AFATC LARHNTYLQECTGQREP 

TYQLWrKDIKLLFLRFAMEQSFSADTGGGGRBSNIHtilPYIIHT 

QLYVIiNTTRATSREEKNLQGFIjEQPKEKWVESAFEVDGPYYFTV 

LALHI bPPEQWRATRVEIIiRRIjLVTSQARAVAPGGATRLTDKAV 

KDYSAYRSSLLFWALVDLIYNMFKKVPTSE5TEGGWSCSLAEYIR 

HNDMPIYEAADKALKTFQEEFMPVETFSEFLDVAGLLSEITDPE 

SFLKDLLWSVP 


5789 


1 


2407 


LPLHAVEKTGRPGQPALKMPGKIiRSDAGLESDTAMKKGETLRKQ 
TEEKEKKE KPKSDKTEE I AEEEETVFPKAKQVKKKAEPSEVDMN 
S PKSKKAKK\ KE E PSQNDI S PKTKS LRKKKEP I E KKWSSKTKK 
VTKWEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKLKNGFPHP 
EPDCNPS EAASEESNSEIEQEI PVEQKEG\AFSNFPISEETIKXt 

l kgrgvtflfp i qaktfhhvysgkdli aqartgtgkt7s fai pl 
ieklhg\el>qdrkrgrapqvlvlaptrelanqvskdfsditkkl 

S VACFYGGTPYGGQ FERMRNGI DILVGTPGRI KDH I QNGKLDLT 

KLNHWLDEVDQMLDMGFADQVEEILSVAYKKDSEDNPQTLLFS
ATCPHWVFNVAKKYMKSTYEQVDLIGKKTQKTAITVEHLAIKCH
WTQRAAVIGDVIRVYSGHQGRTIIFCETKKEAQELSQNSAIKQD
AQSLHGDIPQKQREITLKGFRNGSFGVLVATNVAARGLDIPEVD
LVIQSSPPKDVESYIHRSGRTGRAGRTGVCICFYQHKEEYQLVQ

VEQKAG I KFKRIGVPS ATE 1 1 KASS KDAI RLLDS VP PTA ISHFK 
QS AEKLIEEKGAVEALAAALAH ISGATSTOQRSL INSNVGFVTM 

TLOf^flTTTMDWTCVfiuvcT lrt?riT f*WTr\ov\rvnvnroT tr-r nTmn 
■* J.xji'ifn j. a x AW J\JiJUKJtiyjjt»liiii X UfaltVKGMVPLiKQr^jGVCF 

DVPTASVTEIQEKWHDSRRWQLSVATEQPELEGPREGYGGFRGQ 
REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSKRSQNK 
GQKRS FS KAFGQ 


5790 


3786 


1585 


ARRQRDPLQAIjRRRNQEIiKQQVDSLLSESQLKEALEPNKR^HI Y 
QRCI QLKQAIDENKNALQKLS kadesap vanynqrkeeehtlld 
KLTQQLQGLAVTISRENI TEVGAPTEEEEES ESEDSEDSGGEEE 
daeeeeeekeeneshkwstgee y IAVGDFTAQQVGDLTFKKGE I 
LLVIEKKPDGWWIAKDAKGNEGLVPRTYLEPYSEEEEGQESSEE 
GSEEDVEAVDETADGAEVKVQRTDPHWSAVQKAISEAGIFCLVN 
HVS FCYL I V1MRKRMETVEDTNGS ETGFRAWNVQSRGR I FLVSK 



5791









1636



5792 



5793 



2263 



653 



5794 



653 



PVLQQIMTVu VJjT'X'MGAI P AG F R PSTLSQLL KKGNQFRAN Y FLQ ~ 
PE LMPS QIAKKDLM WD ATEGTIRS RPSR I SL ILTLWS CKM I PL P 
GMSIQVLSRHVRLCLFDGNKVLSNIHTVRATWQPKKPKTWTPSP 
QVTRILPGLLDGDCFIRSNSASPDLGILFELGISYIRNSTCBRG 
ELSCGWVFLKLFDASGVPIPAKTYELFLMGGTPYEKGIEVDPSI 
SRRAHGSVFYQIMTMRRQPQLLVKLRSLNRRSRNVLSLLPETDI 
GNMCSIHLLIFYRQILGDVLLICDRMSLQSTDLISHPMLATFPML 
LEQPDVMDALRSSWAGQES\TLKRSEKR\PK3FLKVPRFLLVYH 
NGCVLPLL/HTPTRLPPFRWAEEErETARWKVITDFLKQNQENQ 
GALQaLLSPDGVHEPFDLSEQTYDFLGEMRK NAV 
JjK VAE FAGTfa' k/ j. GAGL I Q PLHRAPARDHGL jRGGAAPALS VS H 
GN/GKQL/AMSSQGSDDEQIKRENIRSLTMSGHVGFESLPDQLV 
NRSIQQGFCFNILCVGBTGIGKSTLIDTLFNTNFEDYESSHFCP 

nvklkaqtyelqesnvqlkltivntvgfgdqinkeesyq^ivdy 
idaqfeaylqeelkikrslftyhdsrihvclyfisptghslktl 

DLLTMKNLDSKVYIIPVIAKADTVSKTELQKFKIKLMSELVSNG 
VQIYQFPTDDDriAECVWAAMNGQLPFAWGSMDEVKVGKKMVKA 
RQYPWGWQVENEKHCDF^REMLICTNMEDLREQTHTRHYEL 
YRRCKLEEMG FTD VGPENKP VS VQETYEAKRHEFHGBRQRKEEE 
MKOMFVQRVKEKEAILKEAERELQAKFEHLKELHQEERMKLEEK 
RRLLEEEIIAFSKKKATSEIFHSQSFtiATGSMLRKDKDRKMSQF 
FVKQKVPEHRRSSSQANFIKKKLEVCFDFAVICF1TSIFGEQPO 
LIiIFMEKYFQVQGQYISQSE 

AAAAPSPAWwUGVKVVYVVHTCWVMYGIV^TRPCSGDASCIQpy- 

LARRPKLQL \ RHS FTTTRS HLGAENN I DLVLNVEDFD VES KFER 

TVNVS VPKKTRNNGTL YAYI FLHHAGVLPWHDGKQVHLVSPLTT 

YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL 

HVMADNFVFDGSSbPADVHRYMKMIQLGKTVHYLPILFIDQLSN 

RVKDLMVINRSTTELPLTVSYDKVSLGRLRFWIHMQDAVYSLQQ 

FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFIiAFKND 

ISFWKKKKSMIGMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP 

AGVGAAIELWKVKKALKMTI FWRGLMPEFQFGTYS ESERKTEE Y 

DTQAMKYLSYLLYPLCVGGAVYSLLNIKYKSWYSWLINSFVNGV 

YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTPIDDVFAF 

IITMPTSHRLACFRDDWFLVYLYQRWIiYPVDKRRVNEFGESYE 
EKATRAPHTD 



S016 



AAAAPSPAWWCGVKVVYVV HTCWVMYGIViTKPCSGDASCIQPY 
IARRPKLQL \RHS FTTTRS HLGAENNI DLVLNVEDFD VES KFER 
TVNVS VP KKTRNNGTL YAY I FLHHAGVLPWHDGKQVHLVSPLTT 
YMVPKP EE INLLTGESDTQQ I EADKKPTSALDEP VSHWRPRLAL 
NVMADNFVFDGSSLPADVHRYMKMIQLGKTVHYLPILFIDQLSN 
R VKD LMVINRS TTELPLTVS YDKVS LGRLRFW I HWQDAVYS LQQ 
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND 
ISFWKKKKSMIOMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP 
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY 
DTOAMKYLS YLL YPLCVGGA VYSLLWIKYKS WYS WL IWSF VWG V 
YAFGFLFMLPQLFVWYECLKSVAHLPWKaFTYKAFNTFIDDVFAF 

IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE
EKATRAPHTD



MGPRLSVWLL.I^PAAL1^HEEHSRAAAKGGCAGSGCGKCDCHGV~~ 
KGQKGERGLPGLQGVIGFPGMQGPEGPQGPPGQKGDTGEPGLPG 
TKGTRGP PGASG YPGNPGLPGI PGQDGPPGPP3IPGCNGTKGER 
GPLGP PGLPGFAGNPGPPGLPGMKGDPGE I LGHVPGMLLKGERG 
FPGIPGTPGPPGLPGLQsPVGPPGFTGPPGPPGPPGPPGEKGQM 
GLS PQGPKGDKGDQGVSGPPGVPGQAQVQEKGDFATKGEKGQKG 
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 
YPGLIGRQGP\QGEKGEAGPPGPPGIVIGTGPLGEKGERGYPGT 
PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD 
RGFPGTSLPGPSGRDGLPGPPGS PGPPGQPGYTNGI VEOQPGP P 
GDQGPPGIPGQPGFIGEIGEKGQKGESCLICDIDGYRGPPGPQG 
PPGEIGFPGQFGAKGDRQLPGRDGVAGVPGPQGTPGLIGQPGAK 


5795






GEPGEFYFDLULKGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG 

PKGSPGSVGLKGBRGPPGGVGFPGSRGDTOPPGPPGYGPAGPIG 

DKG QAG FPGGPGS PGL PGPKGE PGKIVPLPGPPGAEGLPGS PG F 

PGPQGDRGFPGTPGR\PGL\PGEKGAVG\QPGIGFPGPPGPKGV 

DGLPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGLKGB 

PGLPGIPGTPGEKGSIGVPGVPGEHGAIGPPGLQGIRGEPGPPG 

LPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPG 

FPGLDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPGIPGFPG 

SKGEMGVMGTPGQPGSPGPWGAPGLPGEKGD\HGFPGSSGPRGD 

PSLKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 

EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 

GPKGSVGGMGLPGTPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQ 

AGPPG IG I PGLRGEKGD QG I AGFPGS PGEKGE KGS IGI PGM PGS 

PGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGIPGVKGEAGLPG 

TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 

DKGS KGE VGFPGLAGSPG I PGSKGEQG FMGPPGPQGQPGLPGS P 

GHATEGPKGDRGPQGQPGLPGLPGPMGPPGLPGIDGVKGDKGNP 

GWPGAPGVPGPKGDPGFQGMPGIGGSPGITGSKGDMGPPGVPGF 

QGPKGLPG LQG IKGDQGDQG VPGAKGIiPG PPG P PGP YDI I KGE P 

GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 
PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDHGFIt 
VTRHSQT IDDPQCPSGTKI LYHGYSLLYVQGNERAHGQDLGTAG 
SCLRKFSTMPFLFCNI NNVCNFASRKD YSYWLSTPEPMPMSMAP 
ITGFWIRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLWI 
GYSFVMHTSAGAEGSGQAIASPGSCLEEFRSAPFIECHGRGTCN 
YYANAYSFWIATIERSEMFICKPrPSTLKAGEnRTHVSRCQVCMR 


5796 


1192 


61 


STRSPTVEYlSAHPHIIiFMLLKGYEAPQIALRCGIMLRECIRHE 
PLAKI ILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVL 
VADFLEQNYDTIFEDYEKLLQSSNYVTKRQSLKLLGELILDRHN 
FAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPH 
KTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKITYLIKQI 
RDLKKTAP +RALRDS KR 


5797


2 


1078 


GR VGWELWCM i ISPPKDWWDAGDPSLPIRTPAMXGCS FWNRKF 
FG E I GLLD PGMDVYGGENI ELG I KVWLCGGSME VL PCSRVAH I E 
RKKKP YNSN I GFYTKRNALRVAEVWMDD YKS HVY IAWNLPLENP 
G IDI GDVSERRALRKS IjKCKNFQ WYLDHVYPEMRR YNNTVAYGE 
LR^KAKDVCUX3GPLEmiTAILYPCHGMGPQLARYTKEGFLHL 
GALGTTTIiLPDTRCLVDNS KS RLPQLLD CDKVXS SL YKRWNFIQ 
NGA IMNKGTGRCLE VENRGLAG IJ>LIURSCTGQRWTI KNS IK*R 

EGAGALEPGPQDMAAPPMIWTSCPGGETARGRQVLDGPPRASPG 
QHRDPG 


5798 


2 


891 


PRVRQKTLVDVTLENSNIKDQIRNLQQTYEASMDKLREKQRQLE

VAQVENQLLKKKVESSQEANAEVMREMTKKLYSQYEEKLQEEQR 

KHSAEKEALLEETNSFLKAIEEANKKMQAAEISLEEKDQRIGEL 

DRLIERMEKERHQLQLQLLEHETEWSGELTDSDKERYQQLEEAS 

ASLRERIRHLNDMVHCQQKKVKQMVEEIESLKKKLQQKQLLILQ 

LLEKISFLEGENNELQSRLDYIiTETQAECTBVETREIGVGCDLLP 

SQTGRTREIVMPSRKYTPYTRVLELTMKKTLT 


5799 


644 


115 


KILGSRWKSMHWQEKQPYYEEQARLSKIHLEKYPNYiCYKPRPKR 
TC I VDGKKLR I GS YKOLMRS RROFMBfi RTTTun nnon tdt TTv-tntn 

WYPGAITMATTTPSPQMTSDCSSTSAfiPEPSLPVIQSTYGMKT 
DGGSIAGNEMINGEDEMEMYDDYEDDPKSDYSSENEAPEAV5AN 




2679 


1435 



LLSTYI KFINLFPETKATIQGVLRAGSQLRNAD VELQQRAVEYii ~ 
PLSSVASTD VLATVLEEMPPFPERES S I IiAKLKRKKGPGAGSAL 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADItLGIaRAAPPPAAP 
PAS AGAGNLL VD VFDGPAAQPS LGPTPEEAFLS PGPEDIGP P I P 
EADELLNKFVCKNNGVLFEKTQLLQIGVKSEFRQNLGRMYLFYGN 
<TSVQFQMFSPTWHPGDLQTQLAVQTKRVAAQVDGGAQVQQVI> 
TIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTXNKFFQPTEM 
^QDFFQRWKQLSLPMBAQKrFKAHHPMDAEVTKAKLLGFaSA 








LLDNVD PNPENF VGAG 1 1 QTKALQVGCLUUjEPNAQAQMy RLT L 
RTSKE P VS RHLCELIiAQQ F 


5800 


2679 


1435 


IiLSTYIKFIWLFPETKATIQGVLRAGSQIjRNADVELQORAVEYI, 
TLSSVASTDVLATVLE3MPPFPERESS I LAKLKRKKG PGAGSAL 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPBEAFLSPGPEDIGPPIP 
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYIiFYGN 
KTSVQFQNFSPTVVHPGDLQTQLAVO/TKRVAAQVDGGAQVQQVIi 
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA 
LLDNVDPNPENFVGAG I IQTKALQVGCLLRLEPNAQAQMYRLTL 
RTSKEPVSRHLCELLAQQF 


5801 


3 


1413 


FPRLYHLIPDGEITS I KINRVDPSE&L6 IRLVGGSETPLVHI I I 
QHTYRDGVI ARDGRLLPGDI ILKVNGMD I SNVPHNYAVRLLRQP 
CQVkWLTVMREQKFR S RNNGQ APDAYRPRDDS FHVI LNKS S PEE 
QLG rKLVRKVDEPGVFI FNVLDGGVAYRHGQLEENDRVLAINGH 
DLR YGS P ESAAHL I QAS ERR VHI» WSRQ VRQRS PDI FQEAGWNS 
NGSWSPGPGERS^PKPLHPTITCHEKWWIQKDPGESLGMTVA 
GGASHREWDLPI YV1S VE PGGVI SRDGRIKTGDILLN VDGVELT 
E VSRSEAVALLKRTSSS I VLKALEVKEYEPQEDCSS PAALDSNH 
NMAPPSDWSPSWVMWLELPRCLYNCKDIVLRRNTAGSLGFCXVG 
G YEE YNGNKP FFI KS I VEGT PAYNDGR I RCGD ILLAVNGRS TSG 
M2HACLARLLKELKGR I Tl/I I VS WPGTFL 


5802 


3 


250 


CFSLYQIMERIMDL PTLbRHAFREMFSVGGLFWMFRI RI I LCLM 
GAFFYLISPLDFVPEALFGILCFLDDFFVIFLLIjIYISIMYREV 
ITQRLTR 


5803 


2234 


1299 


EAQFGTTAE I YAYREEQDFGIE I VKVKAIGRQRFKVLELRTQSD 
GIQQAKVQILP3CVLPSTMSAVQLESLNKCQIFPSKPVSREDQC 
SYKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRIKKQLRE 
WDENLKDDSLPSNPIDFSYRVAACLPIDDVIiRIQULKIGSAIQR 
LRCE LD I MNKC TSL C CKQCQETE ITTKNE I FSLS LCG PMAAYVN 
PHGYVHETLTVYKACNLNLIGRPSTEHSMFPGYAWTVAQCKICA 
SHIGWKFTATKXDMS PQKFWGLTRSALLPT1 PDTEDE I S PDKV1 
LCXj * 


5804 


2 


1707 


EME KQRQEEQRKRTEEER KRRI EQDM LEKRKI QRELAKRAEQ IE 
DINNTGTESASEEGDDSLLITWPVKSYKTSGKMKKNFEDLEKE 
REEKERIKYEEDKRIRYEEQRPSLKEAKCLSLVMDDE1ESEAKK 
ESLSPGKLKLTFEELERQRQENRKKQAEEEARKRLEEEKRAFEE 
ARRQMVNEDE ENQDTAKI FKG YR PGKLKIiS FEEMERQRREDE KR 
KAEEEARRRIEEEKECAFAEARRNMWDDDSPEMYKTISQEFLTP 
GKLEINFEELLKQKMEEEKRRTEEERKHKLEMEKQEFEQLRQEM 
GEEEEENETFGLSREYEELIKLKRSGSIG^UCKLKSKFEKIGQLS 
EKEIQKKIEEERARRRAIDLEIKEREAENFHEEDDVDVRPARKS 
EAPFTHKVNMKARFEQMAXAREEEEQRRIEEQKLLRMQFEQREI 
DAALQKKREEEEEEEGSIMMGSTAEDEEQTRSGAPWFKKPLKNT 
SWDSE PVRFTVKVTGEPKPEITWWFEGEILQDGEDYQYI ERGB 
TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 


776 


Y I S DTLGQVYKS KIRW W I EENGGNGNISVDDL IALLDLAEHASS 
AFKESQQQSBDREYEVKERLYPKSKRRYDTYNIAGYQGEIEVGL 

LKNYI P YLTKLXFSLKKSFDFFDE YFVLLKPRNNI KQNEEAKTR 
RKVAGYFKKYVDIFCLLEE3QNNTGLG9KFSEPLQVERCRRNI*V 
ALKADKFSGLLEYLIKSQEDAISTMKCIVNEYTFLLK 


5806 


12S7 


877 


AVFTFHNHGRTAKLYSLHSWLGITTVFLFACQRFLGFAVFLLPW 
ASMWLRS LLKP I HVFFGAAI LSLS IAS VISG I NE KLFFS IiKNTT 
RPYHS LPSEAVFANSTGMLWAFGLLVLYILLAS SWKRP 


5807 


22S7 


1302 


RFS KKTFRRPMAVDIQPACLGLYCGKTIiLFKNGSTE I YGECGVC 
PRGQRTNAQKYCQPCTESPELYIJWLYLGFMAMLPLVriHWFFIEW 
YSGKKSSSALFQHITALFECSMAAIITLLVSDPVGVDYIRSCRV 
LMLS DWYTML YNPS PD YVTTVHCTHE AVYPLYT I VFI YYAFCIiV 



LMMLLRPLLVKKIACGLGKSDRFKS I Y AALYFFFILTVLQAVGG 
GLLYYAFP YI ILVLSLVTLAVYMSASE 1 ENCYDLLVRKKRL I VL 
FSHWLLHA YG 1 1 S I £ RVDKLE QDLPLLALVPTPALF YIjFTAKFT 
EPSRILSEGANGH 



433 



464 



SLPDSGWEYLSNGGVADNHKDFGELRYNECLMNFSCNGKNGSS' 

EGRITHGFQLKSAYEJNNLMPYTNYTFDFKGVIDYIFYSKTHMNV 

LGVLGPLDPQWLVEWWITGCPHPHXPSDHFSLLTQLEUIPPLLP 
LVNGVHLPNRR 



2422 



5810 



1641



I LVPGFQGI I>HPG VYCALQSQHQAQEL VAD I DSCE VSGLCRKGG 
RCVNTHGSFECYCMDGYLPRNGPEPFHPTTDATSCTEIDCGTPP 
EVPIX5YIIGNYTSSLGSQVRYACREGFFSVPEDTVSSCTGLGTW 
ESPKLHCQEINCGNPPEMRHAILVGNHSSRLGGVARYVCQEGFE 
SPGGKITSVCTEKGTWRESTLTCTEILTKINDVSLFNDTCVRWQ 
INS RRINPKI S YV I S I KGQRLDPMES VREETVNLTTDS RTPE VC 
LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLEDDGSFNI 
SIFNETCLKLNRRSRKVGSEHMYQFTVLGQRWYLANFSHATSFN 
FTTREQVP VVCLDLYP TTD YTVWVTLLRS P KRHS VQI T I ATP PA 
VKQT1SNX SGFNETCIjR WRS I KTADMBEW YL FHI WGQR W YQKEF 
AQEJ-ITFNISSSSRDPEVCtiDLRPGTNYNVSLRAIiSSELPVVISIi 
TTQITEPPLPEVEFFTVHRGPLPRLRLRKAKEKNGPISSYQVLV 
LPIiALQS TFS CDSEGASS FFSNAS D ADG YVAAELLAKDVP DDAM 
B I P IGDRL YYG E YYNAPLKRGSD YCI I LRI TS E WNKVRRH S CAV 
WAQVKDSSI*MLLQMAGVGLGSLAWIILTFLSFSAV 



1918 



5812 



5204 



851 



2744 



KVFGTHKDHEVSTLDTAISAVKVQLAEFLENLQEKSIiRlEAFVS ' 
BIESFFNTlEENCSKNEKRLEEQNEEMMKKVIiAQYDEKAQSFEE 
VKKKKMEFLHEQMVHFLQSMDTAKDTLETIVREAEELDEAVFLT 
SFEEINERLLSAMESTASLEKMPAAFSLFEHYDDSSARSDQMLK 
QVAVPQPPRLEPQEPNSATSTT1AVYWSMWKEDVIDSFQVYCME 
E PQDDQE VNELVEE YRIjTVKES YCI FEDLEPDRCYQVWVMAVNF 
TGCSLPSERAIFRTAPSTPVIRAEDCTVCWKTATIRWRPTTPEA 
TETYTLE YCRQHS PEGEGLRS FSGI KGLQLKVNLQPNDNYF FYV 
RAINAFGTSEQSEAALISTRGTRFLLLRETAHPALHISSSGTV1 
S FGERRRLTE I PS VLGEELP S CGQHYWETTVTDCPAYRLGICSS 
SAVQAGALGQGETSWYMHCSEPQRYTFFYSGIVSDVHVTERPAR 
VGIX»LDYNNQRLIFINAESEQI)LFIIRHRFKEGVHPAFALEKPG 
KCTLHLGIEPPDSVRHK 



AAALADPIiPEDKWSAEKRRPLKSSLGYE ITFSLLNPDPKSHDVY 
WDIEGAVRRYVQPFLNALGAAGNFSVDSQILYYAMLGVNPRFDS 
ASSSYYLDMHSLPHVINPVBSRLGSSAASLYPVLNFLLYVPELA 
HSPLYI QDKDGAPVATNAFHSPRWGGIMVYNVDS KTYNASVLPV 
RVEVDMVRVMEVFLAQLRLLFGIAQPQLPPKCLLSGPTSEGLMT 
WE L DRLLWARS VENLATATTTLTS LAQLLGK I SN I VI KDD VAS E 
VYKAVAAVQKSAEELASGHIiASAFVASQEAVTSSELAPFDPSLL 
HLLYFPDDQKFAIYIPIiFLPMAVPILLSLVKIFLETRKSWRKPE 



GGRQRCQRGRSCGAREBEVEPGTARPPPAASAMDASLEKIADPT" 

IiAEMGKNLKEAVKMLEDSQRRTEEENGiaaJSGDIPGPLQGSGQ 

DMVSILQLVQNLMHGDEDEEPQSPRIQNIGEQGHCOALLGHSLGA 

YISTLDKEKXRKLTTRILSDTrLWLCRIFRYENGCAYFHEEERE 

GLAKICRIiAIHSRYEDFVVDGFNVLYNKKPVIYLSAAARPGLGQ 

YLCNQLGL P FPCLCR VP CNTVFGSQHQMDVAFLE KL I KDD I ERG 

RLPLLLVANAGTAAVGHTDKIGRLKELCEQYGIWLHVEGVNLAr 

LALGYVSSSVIAAAKCDSMTMTPGPWLGLPAVPAVTLYKHDDPA 

LTLVAGLTSNKPTDKIiRALP LWLSLQ YLGLDGF VERI KHACQLS 

QRLQESLKKVNYIKILVEDELSSPVVVFRFFQELPGSDPVFKAV 

PVPKMTPSGVGRERHS CDALNRKLGBQL KQLVPASGLTVMDLEA 

EGTCLRFSPLMTAAVLGTRGEDVDQLVACIESKLPVLCCTLQLR 

EEFKQE VEATAGLL Y VDDPNWSGIGWRYEHANDDKS S L KS YPQ 

GENIHAGLLKKLKELESDLTFKIGPEYKSMKSCLYVGMASDNVH 

AAELVETIAATAREIEDNSRLLENMTEWRKGIQEAQVELQKAS 

EERLLEEG VLRQ I PWGS VTiNWFSPVQALQtOGRTFNt»TAGSIiES 










TEPIYVyKAQGAGVTLPPTPSGSRTKQRLPGQKPFKRSLRG^DA - 

LSKTSSVSHIBDLEKVERLSSGPEQITLEASSTEGHPGAPSPQH 

TDQTEAFQKGVPHPEDDHSQVEGPESLR 




5813 


293G 


699 


HRDGVSGSLERPLTDRSRTGAFAQQRGKMATAGGGSGADPGSRG 
LLRLLSFCVLLAGLCRGNS VERKI YI PLNKTAPCVRLOJATFQ I 
GCQSSISGDTGVIHWEKEEDLQWVLTDGPNPPYMVLLESKHFT 
RDLMEKLKGRTSRIAGIAVSIiTKPSPASGFSPSVQCPNDGFGVY 
SNSYGPEFAHCREIQWNSLGNGIAYEDFSFPIFLLEDENETKVI 
KQ C YQDHNLSQNGSAPTF P LCAMQLFSHMAWL S FSTAT \ CMRR S 
S IQS TFS INPKI VCDP LSDYNVWSMLKP INTTGTLKPDDRVWA 
ATRLDSRS FFWNV\APGAES AVAS FVTQLAAAEALQKAPDVTTL 
PRNVMFVFFQGETFDYIGSSRMVYDMEKGKFPVQLENVDSFVEL 
GQVALRTSLELWMHTDPVSQKNESVRNQVEDLLATLEKSGAGVP 
AVILRRPNQSQPLPPSSLQRFLRARNISGWLADHSGAFHNKYY 
QSIYDTAENINVSYPEWLEPLKE/ETWNFG+QDTAKALADVATV 
LGRALYELAGGTNFS DTVQADP QT VTRLLYG \ FLIKANNS WFQS 
ILC^RDLRSYLG*RGI»FQH\YIAV\SSPTNTIYV/VLQYALANL 
TGTWNLTREQCQDPSKVPSENKDLYEYSWVQGPU-1SNETDRLP 
R C VRSTARLARALSPAFE LS QWS S TE YSTWTES RWKD I RAR I FL 

IASKELELITLTVGFGILIFSLIVTYCINAKAjDVIjFIAPREPGA 
VSY 




5814 


8500 


432 



ALKCRPRRVLAILVGPVQPDRMAEEGAVAVCVRVRPZ^SRteESIi 

getaqvywkthnnvi ypvdgs ksfnfdrvlhgnetpknvyea\ i 
aapiidsaiqgyngtifaXygqtNasgktytmmgsedhlgvipq 

GQFHGHFSQKI*EVFLDREFLLRVSYMEIYKBTITDLIiCGTOKM 
KPLIIREDVNRNVYVADLTEEWYTSEMALKWITKGEKSRHYGE 
TKMNQRSSRSHTIFRMILESREKGEPSNCEGSVKYSHLNLVDLA 
GSERAAQTGAAG VRLKEGCNTI NRSLF IliGQVI KKLSDGQVGG FX 
NYRDS KLTR ILQNSLGGNPKTRI I CTITP VSFDETJjTALQFAST 
AKYMKNTPYVUEVSTDEALIiKRYRKEIMDbKKQJbEEVSLETRAQ 
AMEKDQIiAQLLEEKDLlLQKVQNEKlENLTRMLVTSSSLTLQQEL 
KAKRKRRVTWCIiGKINKMKNSNYADQFNI PTNITTKTHKLSINL 
LRBIDESVCSESDVFSNTLDTLSEIEWNPATKLLNQENIESELM 
SU^YDNLVLDYEQLRTEKEEMELKLKEKNDLDEFEALERKTK 
KDQBMQLIHE ISNLKNLVKHREVYNQDLENELSS KVELLRE KED 
QIKKLQEYIDSQKIjENIKMDLSYSLESIEDPKQMKQTLFDAETV 
AIJ3AKRE S AFLRS ENIJELKEKMKELATTYKQMEND IQLYQS QliE 
AKKKMQVDLEKELQSAFNEITKLTSLIDGKVPKDLLCNLELEGK 
ITDLQKELNKEVEEWEALREEVILLSELKSLPSBVERLRKEIQD 
KSEELH 1 1 TSEKDKL FS EWHKES R VQGLLE E IGKT KDDLATTQ 

SNYKSTDQEFQNFKTLHMDFEQKYKMVLEENERMNQEIVNLSKE 

AQKFDSSLGALKTELSYKTQELQEKTREVQERLNEMEQLKEQLE 

KRDSPLQTVEREKTLITEKLQQTLEEVKTLTQEKDDLKQLQESL 

QIERDQLKSDIHDTVNMNIDTQEQLRNALESLKQHQETrNTI»KS 

KISEEVSRNLHMEENTGETKDEFQQKMVGIDKKQDLEAKNTQTL 

TADVKDNEIIEQQRKIFSLIQEKWELGQMLESVIAEJCEQLKTDL 

KENIEMTIENQEELRLLGDELKKQQEIVAQEKNHAIKKEGELSR 

TCDRLABVEEKLKBKSQQLQEKQQQLLNVQEEMSEMQKKINEIE 

NLKNELKNKELTLEHMETBRLELAQKLNENYEEVKS ITKERKVL 

KELQKS FETERDHLRG Y I RE I EATGLQTKEEL KIAH IHLKEHQE 

TI DELRRS VS EKTAQ I INTQDLEKSHTKLQEE I PVLHE EQELLP 

NVKKVSETQETMNEIiELLrEQSTTKDSTTLARXEMERLRLNEKF 

QESQEEIKSLTKERDNLKTIKBALEVKHDQLKEHIRETLAKIQE 

SQSKQEQSLNMKEKDNETTKIVSEMEQFKPKDSALLRIEIEMLG 

LSKRLQESHDEMKS VAKEKDDLQRLQEVLQS ESDQLKENI KE I V 

kKHLETE EE LKVAHCCLKEQEET I WELR VNLS E KETE I S T IQKQ 

LEAINDKLQlffKlQEIYEKEEQLNIKQISEVQEKVNELKQFKEKR 

KAKDSALQS I ESKMLELTNRLQESQEEIQ IMI KEKEEMKRVQEA 

[jQIERDQLKENTKEIVAKMKESQEKEYQFLKMTAVNETQEKMCE 

rEHLKEQFETQKLNLENIETENIRLTQILHENLEEMRSVTKERD 

DIjaSVEETLKVERIXJLKENLRETITRDLEKQEELKIVHMHLKEH 








QETIDKLRGIVSEKTNEISNMQKDLEHSWDALKAQDLKIQEELR 
I AHMHLKEQQET I DKLRG I VS E KTDKLS WMQKDLEWSNAKLCJE K 
IQELKANEHQLITLKKDVNETQKKVSEMEQLKKQI KDQSLTLS K 
LEIENLETLAQKLHENL3EMKSVMKERDNLRRVEETLKLERDQLK 
ESLQETKARDUEIQQELKTARMLSKEKKETVDKLREKISEKT1Q 
ISDIQKDLDKSKDELQKKIQELQKKELQLLRVKEDVNMSHKKIN 
EMEQLKKQFEPNYLCKCEMDNFQLTKKLHESLEEIRIVAKERDE 
LRR IKE SI»KMERDQF I ATLRE MI ARDRQNHQ VKP E KRLLSDGQQ 
HLMESLREKCSRI KELLKRYSEMDDHYE CLNRLSLDLEKEI EFH 
RIMKKLKYVLSYVTKIKEEQHECINKFEMDFIDEVEKQKELLIK 
IQHLQQDCDVPSRELRDLKLNQNMDLHIEEILKDFSESEFPSIK 
TEFQQVLSNRKEMTQFLEEWUfTRFDIEKLKNGIQKENDRICQV 
NNF FNNR 1 I AI MNESTE FEERS ATI S KE WEQDLKSLKE KNEKLF 
KNYQTLKTSLASGAQVNPTTQDNKNPHVTSRATQLTTEKrRELE 
NSLHEAKESAMHKESK1IKMQKELEVTNDIIAKLQAKVHESNKC 
LEKTKETI QVLQDKVALGAKP YKEE I EDLKMKLGKIDLEKMKNA 
KEFEKEISATKATVEYQKEVIRLLRENLRRSQQAQDTSVISEHT 
DPQ PSNKP tiTCGGGSG I VQNTKALI LKS EH I RLEKEI S KLKQQN 
EQLXKQKNELLSWNQHLSNEVKT57KERTLKREAHKQVTCENSPK 
SPKVTGTAS KKKQITP SQCKERNLQDP VP KE S PKSCFFDSRSKS 

LPSPHPVRYFDNSSLGLCPEVQNAGAESVDSQP\GPWARLFQGK 
DVP\ECKTQ 


5815


23 


1460 


SELVMWTVQNRESrjGLLSFPVMITMVCCAHSTNEPSNMSYVKET 
VDRUjKGYDIRLRPDFGGPPVDVGMRIDVASIDMVSEVNMDYTL 
TMYFQQSWKDKRIjS YSGI PLNLTLDNRVADQLWVPDTYFLNDKK 
SFVHGVTVKNRMIRLHPDGTVLYGLRITTTAACMMDLRRYPLDE 
QNCTLEIESYGYTTDDIEFYWNGGEGAVTGVNKIEIiPQFSIVDY 
KMVSKKVEFTTGAYPRLSLSFRLKRNIGYFILQTYMPSTLITIL 
SWVS FWINYDASAARVALG I TTVLTKTTI S THLRETL PKI P YVK 
AIDIYLMGCFVFVFLALLEYAFVNYIFFGKGPQKKGASKQDQSA 
NEKNKLEMNKVQ VDAHGN I LLSTLE I RNETSGSE VLTS VSD P KA 
TMYSYDSASIQYRKPLSSRE\A*GRAPDRHGVPSKGRIRRRAS\ 
QLKVKI PDLTDVNS IDKWSRMFFP I TFSLFNWYWLYYVH 


5816 


861 


191 


TSSRSRAAAQE GDAETPGS VERRGRRAGAEDGMS QAPGAQ PS PP "~ 
TVYHERQRLELCAVHALNNVLQQQLFSQEAADBICKR1APDSRL 
NPHRSLLGTGNYD VNVI MAAIjQGLGLAAVWWDRRR PLS QLAL ?Q 
VI^LII^PSPVSLGDLSLPLRRRHLRWPCARL/VTVSYYWLDS 

K\LRAPEGPGGLRTE\*GPFLAAAIiAQGLCEVLLWTKEVEEKG 
SWLRTD 


5817 


851 


118 


RLFRGPGANRGRSCRGCSGGREPSGGALPKRHCPC*PPSPPAAD 
VMSNTTVPNAP QANSDSM VG Y VLGP FFLI TLVGVVVAVVMYVQK 
KKRVDRLRHHLLPMYSYDPAEELHEAEQEIjLSDMGDPKW\QAG 
RVATSTSGCHCWMSRRDLTPLPHPSEPGVIiDCLGPCHLLPLLSP 
GSPCStfVLGLHFSLHPPSAASASHAIiTITSLPPGLLPFVGVELTA 
HPQALMGRGFPSGMAAAGRHLCFL 


5818 


3 



3918 



QALRDKL W IFli VQS FYA VRHTES WKIiMSTDDQ£KIQAAAFDKGD 
DRRLGKKPIFSSSQQRKQVSDSGDIKIKSMRGNNTKKECWSYLST 
NKKMKS DGLGAS GHS S STNRWS INKTLKQDDVKE KDGTKI AS K I 
TKELKTGGKNVSGKPKTVTKSKTENGDKARLEP3MSPRQVVERSA 
TAAAAATGQ KNLLKGKG VRNQEGQ 1 SGARP KVLTGNLNVQAKAK 
rjw\fti^iUJ^FCii5IAGPSSRSTDSSMEFSISTECI)DEPKENGS 
TEEEKPSGHKLS FCDSPGQMMKNS VDS VKWSTVAI KSRPVSRVT 
NGTSNK KS I HEQDTNVNNSVL KKVSGKGCS EP VP QA I LKKRGTS 
NGCTAAQQRTKSTPSNLTKTQGSQGESPNSVKSSVSSRQSDENV 
AKLDHNTTTEKQAP KRKMVKQ VHTALP KVNAKTVAMPKNIjNQS K 
KGE ITiNNKDS KQ KM PPGQ VIS KTQ PS SQRPLKHETST VQKSM FH 
DVRDNNNKDSVSEQKPHECPLINliASEISDAEAIiQSSCRP\DPQK 
PLNDQEKE KIiALECQNI S KLDKS LKHELES KQ I CLDKS ETKFPN 
HKETDD CDAANI CCHS VGSDNVNS KF YS TTALKYMVSNPNENS I» 
tfSNPVCDLDS TSAGQI HL I SDRENQ VGRKDTKf XQSS I KC VEDVS 
jCNPERTNGTLNSAQEDKKSKVPVEGLTI PS KLSDESAMDEDKH f 








ATADSDVS S KCFSGQLS EKNS PKNMETSE S PESHET PET P FVGH 
WNLSTGVLHQRESPESDTGSATTSSDDIKPRSEDYDAGGSQDDD 
G SNDRG IS KCGTML CHDFLGRSS S DTSTP EELKI YDSNLR I E VK 
MKKQSSNDLFQVNSTSDDE IPRKRPRIW^l? Ttrwe-p tro wmt do 
GSVQFAQEIDQVSSSADETEDETCSEAENVAENFSISNPAPQQFQ 
G I INLAFEDATENECRE FSAN KKFKRSVLLS VDECEE LGSDEGE 
VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNILE 
CKQNKGNSVCKNESTVLDUSS IDS SRXNKQSVSATEKKNT IDVL 
SSRSRQLLREDKKVNNGSNVENDIQQRSKFiDSDVKSQERPCHIi 
DLHQREPNS DIP KNS S T KSLDS FRS QVLPQEGP VKESHS TTTE K 
ANIALSAGDIDDCBTLAQTRMYDHRPSKTLSPIYEMDVIEAFEQ 
KVE S3 THVTDMDF* DDQHFAKQDWTLLKQLLSEQDSNLD VTNS V 
PEDLS LAQ YL INQTLLLARDS S KPQ G ITH I DTLNRWS EL TSPLD 
SS AS I TMASFSS EDCS P QGE WT I LELETQH 


5819 


1 


5557 


AAAGLU3ALHLVMTLWAAARAEKEAFVQSES 1 1 E VLRFDDGGL 
LQTETTLGLSSYQQKS I S L YRGNCR ? I R F E PPMLDFHEQ PVQM P 
KME KVYLHN PSS E * TI TLVS I FATTS HFHAS FFQNRKIL PGGNT 
SFDVS/VFIiARWGNVEOTLFINTSNHC3VFTY\QVFGVGVPNPY 
RLRPFLGARVTVNSSFSPIINIHWPHSEPLQVVEMYSSGGDLHL 
EltPTGQQGGTRKLWEIPPYETKGVMRASFSSREADNHTAFIRIK 
TNASDSTEFI I LP VEVE VTTAPGI YSSTEMLDFGTLRTQDLPKV 
LNLHLLNSGTKDVPITSVRPTPQ\NDAITVHFKP ITLKAS \ESK 
YTKVAS IS FDAS KAKKPSQFSGKI TVKAKEKS YS KLE I PYQAEV 

LDGYLGFDHAATLFHIRDSPADPVERPIYLTNTFSFAILIHDVL
LPEEAKTMFKVHNFSKPVLILPNESGYIFTLLFMPSTSSMHIDN
n i i*l i t2jas kfhlpvrvytgfldy fvl p pki eer f idfg vlsat 
easnilfaiinsnpielaikswhiigdgXlsjelvavdrgnrtt 
1 1 sslpecekssssdqssvtlasgyf \avfrvkltakkl\egih 
dgaiqittdyeiltipvk\aviavgsltcspkhwlppsfpgki 
vhqslnimnsfsqkvkiqqirslsedvrfyykrlrgnkedlepg 
kkskianiyfdpglqcgdhcyvglpflsksepkvqpgvamqedm 

WDADWDIiHQSLFKGWTGIKENSGHRLSAIFEVNTDLQKNIISKI 

taelswpsilssprhlkfpltntncss\eeeitlenp/sqdvpv 
wqfiplalysnpsvfvdklvsrfklskvakidlrtlefqvfrn 

SAHPLQS STGFMEG\LS PHXIIiNL 1 LKPGEKKSVKVK\ FTP VHN 

RTVSSLIIVRNKLTVMDAVMVQGQGTTENLRVAGKLPGPGSSLR 

FKITEALLKDCTDSLKLREPNFTLKRTFKVENTGQLQIHIETIS 

I SG YS CEG YGFKWNCQEFTLSANASRD 1 1 ILFTPDFTASRVI R 

ELKF I TTSGSE FVF I LMASLP YHMLATCAEALPR PNWEIAL Y 1 1 

I SG I MS ALFLLVI GTA\ YLEAQGI WE P \ FRRR LS \ FEASNP PFD 

VGRP FDLRRI VG I S S EGNLNTLS CD PGHS RGFCGAGGS SS R PSA 

GSHKQ*GPSGHPHSSHSNRNSADVDDVRAYNSGRTSSMTSAQAA 

S SQPAWTKTRPLVLDSNTGAQGHS AGRKS KGAKQS QHGS QHHAHS 

PLEQHPQPPLPPPVPQPQEPQPERLSPAPLAHPSHPERASSARH 

SSEDSDITSLIEAMDKDFDHHDSPALEVFTEQPPSPLPKSKGKG 

KPLCRKVKPPKKQEEKEKKGKGKPQEDELKDSLADDDSSSTTTB 

TSNPDTEPLLKEDTEKQKGKQAMPEKKESEMSQVKQKSKKLLNI 

KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPLPTAMTSGSK . 

SRNAQKTKGTSKLVDNRPPAIAKFLPKSQELGNTS SS EGEKDSP 

PPEWDSVPVHKPGSSTDSLYKLSLQTLKADIFUCQRQTSPTPAS 

PSPPAAPCPFVARGSYSSIVWSSSSSDPKIKQPNGSKHKLTKAA 

SLPGKNGNPTFAAVTAGYDKSPGGNGFAKVSSNKTGFSSSLGIS 

HAPVDSDGSDSSGLWS PVSNPSS PDFTPLNS FSAFGNS FNLTGE 

VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 

SG S PTHTATS VLGNTS GLWSTTP FSSS I WS SNLS SALPFT TPAN 

TLASIGLKGTENSPAPHAPSTSSPADDLGQTYNPWRIWSPTIGR 

RSSDPWSNSHFPHEN 


5820 


310 


1270 


RVS LSGPVSltGVLLCARS STMGKRDNR VAYMMP I AMARS RGP I Q 
SSGPTIQ\VI*IDQGLPGKK*KSN*KRKRK/DSKALAEFEEKMN 
ENWKKELEKHREKLLSGSESSSKKRQRKKKEKKKSW*\DSSSS\ 
SSSSDSSSSSSDSEDEDKKQGKRRKKKKNRSHKSSESSMSETBS 








DSKDSLKKkKKSKDGTBKEKDIKGLSKKRKMYSEDKPLSSESLS 
ESEYIBBVRAKKKKSSEEREKATEKTKKKKKHKKHSKKKKKKAA 
SSSPDSP*H*EKSGFPYKESAMSEEISTVKTTTYLLKCMNFLVF 
GIIPGLFSSHSDATV 


5821


179 


915 


KWRNQSWRWPKPGTNWMLSCSVOmRVTWTGSVWMRKLGKHPQT 
PT / 1 KD CS I AATGKR PS AR FPHQRRKKRR EMDDGLAEGG PQRSN 
TYVI KLFDRS VDLAQ FSENTPL YP I CRAWMHJNS PSVRERECSPS 
SPLPPXiPEDEEG\SEVTNSKSR*CVQACPPTHTPGGQPKNACR\ 
SR I P S PLAALRMQGT P * RWS PFE PE F S PSTL I YRNMQRWKR IRQ 
RWKEAS HRNQLR YSESMKILREM YERQ 


5822 


454 


4375 


QTL KEM P I VMARDLEETASS S EDKE V I SQE DHP C I MWTGGCRRI 
PVLVFHADAILTKDNNIRVIGERYHLSYKIVRTDSRLVRSILTA 
HG FHE VHPSSTD YNLMW TGS HLKP FLLRTLS EAQKVNHF PR S YE 
LTR KDRLYKN 1 1 RMQHTHG FKAFK 1LP QTFLLP AE YAE FCNS YS 

fcdrgpwivkpvassrgrgWyliknpkqisleenilvsryinnp 

LLIDDFXFDVRLYVLVTSYDPIjVIYLYEEGIARFATVRYDQGAK 
N1RNQFMHLTNYSVNKKSGDYVSCDDP3VEDYGNKMSMSAMLRY 
LKQEGRDTTALMAHVEDLIIKTIISAEIAIATACKTFVPHRSSC 
FELYGFDVL I DSTLKPWLLE VNLS P S LACDAP LDLKI KASMISD 
MFTWGFVCQDPAQRAS TRP I YPTFESSRRNPFQKPQRCRPLSA 
SDABMKMLVGSAREKGPGKLGGSVLGLSMEEIKVLRRVKEENDR 
RGGFIRIFPTSETWEIYGSYLEHKTSMNYMLATRLFQDRMTADG 
APELKI *S1^SKAKLHAALYERKLLSLEVRKRRRRSSRLRAMRP 
KYP VI TQPAEMNVKTETESE EEEE VAIJDNEDE EQ EAS QEE S AGF 
LRBNQAKYTPS LTALVENTP KENSMKVREWNKKGGHCCKLETQE 
L E PK FNLMQ I LQDNGNLS KMQAR XAFSAYLQHVQ I \RLMKDS GG 
QTFSASWAAKEDEQMELWRFLKRASNNLQHSLRMVLPSRRLA1, 
LERTRIIAHQLGDFIIVYNKETEQMAEKKSKICKVEEEEEDGVNM 
ENFQEFI RQASEAELEEVLTFYTQKWKSAS VFLGTKSKISKNNN 
NYSD SGAKGDHPET IMEE VK I KPPKQQQ TTEIHSD KL SRFTTSA 
EKEAKLVYSNSSSGPTATI^KIPNTHLSSVTTSDIiSPGPCHHSS 
LSQIPSAIPSMPHQPTILLNTVSASASPCLHPGAQNIPSPTGI.P 
RCRSGSHT I GP FS S FQS AAH I YSQKIiSRPS S AKAGS C YLNKHHS 
GIAKTQKEGEDASLYSKRYNQSMVTAELQRLAEKQAARQYSPSS 
HrNLLTQQVTNLWLATGI INRSSASAP PTLRPI IS P SGPTWSTQ 
SDPQAPENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGWPQH 
KYHPTAGS YQLQ FALQQLEQQKLQSRQLLDQSRARHQAI FGSQT 
LPNSNLWTMNNGAGCR I SS ATASGQKPTTLPQKWP P PSSCASL 
VP KPP PNHEQVLRRATSQKAS KGS SAEGQLNGLQSSLNPAAFVP 
ITS STDPAHTKIMNHKHTEKQPVHHSWVHD 


5823 


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDD P KKEDI LLLADEKFDF ' 
£>LS LSS S SANEDDE VFFGPFGHKERCIAAS LELNN PVPEQP PLP 
TSESP FAWS PLAGE KF VE VYKEAHLLALH I ES SS RNQAAQAAKP 
ED PRSQGVERFI QE S KF\ KI N LFE KEKEMKKS PTS LKRETYYL S 
DSPLLGPPVGEPRLLASS PALPS SGAQARLTRAPGPPHSAHALP 
RESCTAHAAS QAATQRKPGTKLLL PRAAS VRGRG I PGAAEKPKK 
E I PASPSRTKI PAE KESHRDVL PDPCPAPGAVKVPAAGSHIiGQG K 
RAI P VP \ NKLG LKKTLLKAPGS YSN\ LQRKSS S GA\ VWSGASS A 
CTPQP VAKAKSSEFASI PAN * LPGLCPNI SKS \GRMGPAMLRPA 
L\ PAGPVG \ ASS WQAKRVDVS ELAAEQLTAPP \ SAS PTQPQTPE 
GGG\QWLNSSCAWSE ( 5 c !riT.HtfTPQTDDi3 , nar»T \roifT»tnnwr\»T>r»rr.M 
QFKI PKFS IGDS \ PDSSTPKLSRAQRPQS CTSVGRVT VHSTP VR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFS PEESDSTFS KSTATEVAREEAKPGGDAAPS 
E ALLVDI KLE PLAVTPDAASQPL I DLPL I DFCDTPEAHVAVGS E 
SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSS PI.IQI.S PEADK 
ENVDSPLLKF 


5824 



42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKEDILLIiADEKFDF " 
3LSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
USES P FAWS PLxAGEKFVE VYKEAH LLAL.-H I ES S SRNQAAQ AAK P 


SEQ ID NO: 

Predicted beginning nucleotide location corresponding to first 
amino acid residue of amino acid sequence 

Predicted end nucleotide location corresponding to first 
amino acid residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, 
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop Codon, 
/=possible nucleotide deletion, 
\=possible nucleotide insertion) 
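
The symbol legend in the table header can be read as a small alphabet: twenty residue letters, X for an unknown residue, and three markers (* for a stop codon, / for a possible nucleotide deletion, \ for a possible nucleotide insertion). The following is a minimal sketch, not part of the original disclosure, of how a segment from the table might be tallied against that legend. The function name `summarize_segment` and the choice to report any other character as "unrecognized" are assumptions made for illustration only.

```python
from collections import Counter

# One-letter codes exactly as listed in the table legend.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}

# Non-residue markers from the same legend.
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(segment: str) -> dict:
    """Tally legend symbols in one amino acid segment (hypothetical helper).

    Whitespace is ignored. Characters outside the legend are collected
    verbatim so that they can be reviewed by hand rather than guessed at.
    """
    counts = Counter()
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in RESIDUES:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)
    return {
        "residues": counts["residues"],
        "stop": counts["stop"],
        "deletion": counts["deletion"],
        "insertion": counts["insertion"],
        "unrecognized": "".join(unrecognized),
    }


if __name__ == "__main__":
    # Made-up segment, not taken from the table.
    print(summarize_segment("MK*A/ C\\Q"))
```

Collecting unrecognized characters instead of discarding them keeps the sketch from silently altering a segment; any symbol that is not in the legend is left for manual review.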


5825 






EDPRSQGVEUKIQESKF\KIMLFEKEKSMKKSPTSLKRETYYLS " 

DSPLLGPPVGEPJRLLASSPALPSSGAQARLTRAPGPPHSAHALP 

RESCTAHAASQAATQRKPGTKLLtiPRAASVRGRGI PGAAEKPKK 

EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 

RAIPVPXNKLGLKKTLLKAPGSVSNXLQRKSSSGAWWSGASSA 

CTPQPVAKAKSSEFASIPAN*LPGLCPNISKS\GRMGPAMLRPA 

L \ PAGF VG \ ASS WQAKR VDVS E LAAEQLTAP P \SAS PTQPQTPE 

6GG \ QWLNS S CAW5ES SQLNKTRS IRRRDS CLNS KTKVMP TPTN 

QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 

RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 

PL\CVPARRRSSEPRKNSAMRTEPTRESWRKTDSR\LVDVSPDR 

GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 

EALLVDI KLEPLAVTPDAASQPIi 1 DLPL IDFCDTPEAHVAVGSE 

SRPLIDDMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPEADK 
ENVDSPLLKF 


5826 


2 


4210 


b'LQ I ES AS PAP S GFLAAH PHS PGGS LATKGRSRLSAPGMLHL 
S AAP PAP P PE VTATARP CL*CS VGRRGDGGKMAAAGA1>ERSF VEL 
SGAERERPRHFREFT VCS IGTANAVAGAVKYSESAGGF YYVESG 
KLFSVTRNRPIHWKTSGDTLELMEESLDIWI^NNAIRLKFQNCS 
VLPGGVYVSETQNRVIILMLXNQTVHRLLLPHPSRMYRSELWD 
SQMQSIFTDIGKVDFTDPCWYQLIPAVPGISPKTSTASTAWLSSD 
GEALFALPCASGGIFVLKIiPPyDlPGMVSWELKQSSVMQRLLT 
GWMPTAIRGDQSPSDRPLSLAVHCVEHDAFIFALCQDHKLRMWS 
YKEQMCLMVADMLEYVPVKKDLRLTAGTGHIQLiRLAYfiPTMGLYIi 
GIF\MHAPKRGQFClFQLVSTESNRYSLDHISSIiFTSQETLlDF 
ALTS TD I V3AL»WH DAENQ T WKY I NFE HNVAGQ WNP VF MQ PLPEE 

EIVIRDDQDPREMYLQSLFTPGQFTNEALCKALQIFCRGTERNL 
DLSV7SELKKEVTIiAVEKELQGSVTEYEFSQEEFRNLQQEFWCKF 
YACCLQYQEALSHPIAIJU^PHTNMVCIULKKGYLSFLIPSSLVD 
HLYLLPYENLI,TEDETTISDDVDIARDVICLIKCLRLIEESVTV 
DMS V I MEMS C YNLQS PEKAAEQ I LEDMIT I DVENVMED I CS KLQ 

EIRNPIHAIGLLIREMDYETEVEMEKGFNPAQPLNIRMNLTQLY 
GSNTAG YI VCRGVHKI AS TRFLICRDLL ILQQLLMRL.GDAV I WG 
TCQLFQAQQDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 
LQHLS VIjELTOSGALMANRFVSS P QT I VELFFQEVARKH IIS HL 
FSQPKAPLSQTGLNWPEMITAITSYLLQLLWPSNPGCLFLECLM 
GNCQYVQLQD YI QLLHPWCQ VNVGS CRFMLGRC YLVTGEGQKAL 
ECFCQAASEVGKEEFLDRLIRSEDGEIVSTPRLQYYDKVIALLD 
VIGLPELVIQLATSAITEASDDW\KSQATL\RTCIFKHHL\DLG 
\ HNS QAYGSL * PQ I PDS SRQ LDCLRQLVWLCERSQLQDLVE FS 
YVNLHNE WG I IE SRARAVDLMTHNY YELIjYAFHI YRHN YR KAG 
T VMFE YGMRLGRE VRTLRGi»EKQGNC YI»AALNCLRLI R PE YAWI 

vqpvsgavydrpgaspkrnhdgectaaptnrqieileledleke 
cslar i riitlaqhd ?s avavags ss aeemvtllvqagl fdtai s 

LCQT FKL P LTP VFEGLA FKC I KLQ FGGE AAQ AEAW AWLAANQ LS 

svittkessatdeawrllstylerykvqnnlyhhcvinkllshg 

VP JjPNWL INS YKK VPAAELLRL YIiWYDLLDLTP YQ VIR XCGC 




5827 


3 


871 


k.sqllrdhsapppkpctsvgamgc*prq/spkeqqrqlkkqknr 

AAAQRSRQKHTDKADAI^QQHESLEKDNLAIiRKEIQSLQAELAW 
WSRTLHVHERLCPMDCASCSAPGIiLGCMDQAEGLLGPGPQGQHG * 
(.KCW^aJjrQXPGSCYPAQPIjS PGPQPHDS PSLLQCPLPSLSLGP 
AWAEPPVQLSPSPLLFASHTGSSI.QGSSSKLSALQPSLTAQTA 
PPQPLELEHPTRGKLGSSPDNPSSALGLARLQSREHKPALSAAT 
WQGLWD PS PHPLLAFPLLS SAQVHF 






194 


2287 


^MGSENSALXSYTLREPPFTLPSGLAVYPAVLQDGKFASVFVYK 
^ENEDKVNKAAKVP**HLKTLRHPCLLRFLSCTVEADGIHLVTE 
WQPLEVALETIjSS AE VCAG I YD I LLAL I FLHDRGHLTHNNVCL 
3 SVF VSEDGHW KLGGMETVCKVSQATPEFLRS IQS I RDPAS I P P 

:emspefttlpechghardafsfgtlveslltilneqvsaevls 

iFQQTLHSTLLNPIPKMRPALCTLLSHDFFRNDFLEWNFXKSL 
^KSEEEKTEFFKFLLDRVSCLSEELXASRLVPLLLNOLVPAEP 








vav\ksflp ylu3pkkdhaqgetpcllspalfqsrvi pvllqlf 
evheehvrm vllsh i kayvgalslreqlkkv\il \ pqvllg \lr 
d\tsds i vaitlhslavlvsllgpewvgge3tki fkrtap \s f 
tk\ntdlslegdpfsqpikfpinglsdvkntsedsenfpssskk 
seewpdwsgpeVepenqtvkiXqiwpSrepxcddvksqcitldv 
eess wddceps sldtkvnpggg itatkpvtsgeqkpi palls lt 

EESMPWKS SLPQKI SLVQRGDDADQIE PPKVSSQERPLKVPS BL 
GLGEEFTI QVKKKPVKDPEMDWFADMI PE I KPSAAFLILPELRT 
EMVPKKDDVSP VMQFSS KFAAAE I TEGEAEGWEEEGELNWEDNN 
W 


5828 


2 


257 


AREGGSLGAVAAOSEI^YSCDFCPARPHTSWLTRFVKMEFQAVV 
MAVGGGSRMTDLTS S I PKPLLPVGNKFLIW Y?LNLLERVGFEEV 
I WTTRD VQKALCAE FKMKMKPDI VCI PDDADMGTADSLR Y I YP 
KL KTDVLVLS C DLI TD VALUE WDLFRAYDAS LAMLMRKGQDS I 
E PVPGQKGKKKAVEQRDFIGVDS TGKRLLFMANEADLDEELVIK 
GSILQKHPRIRFHTGLVDAHLYCLKKYIVDFLMENG\SITSIRS 
E L \ I P YL V/RGKQFS S AS SQQGTRKEKEGGS KGKRGL KS FRIS Y 
S FY* KEANYTGTGAPY\D\ACWI 


5829 


260 


1259 


PDGRLIVSCSEDKT1KIWDTTNKQCVNNFSDSVGFAMFVDFNPS 
GTCIASAGSDQTVKVWDVRVNKLLQHYQVHSGGVNCISFHPSGN 
YL I TASSDGTLKI LDLLKGRLI YTLQGHTGPVFTVS FSKGGELF 
ASGGADTQVLLWRTNFDELHCKGLTKRNLKRLHFDS P PHLLDI Y 
PRTPHPHEEKVETVEDFFLHLLRLIQSLR*SICRSLLPLLWISF 
LLI LPQQQKPWGLCQTRVKRPVDIS *TLP *CHQNVCQQPRKRK 
QKT+VTSPVKVK/VSIPLAVTDALEHIMEQLNVLTQTVSILEQR 
LTLTEDKLKDCLSNQQKLFSAVQQKS 


5830 
5831 


4496 


3139 


GGKMAAPEERDLTQEQTEKLLQFQDLIX3IESMDQCRHTLEQHNW 
NIEAAVQDRLNEQEGVPS VFNPPPS RPLQVNTADHRI YS YWSR 
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 
TDPVGDrVSFMHSFEEKYGRAHPVFYQGTYSQALNDAKRELRFL 
LVYLHGDDHQDSDEFCRNTLCAPEVrSLINTRMLFWACSTNKPE 
GYRVSQALRENT YPFLAMI MLKDRRE * PV\ VGRLEGLI \QPDDL 
INQLTF I MDANQT YL VS E RLEREERNQTQVLRQQQDEAYLAS LR 
ADQEKERKKREERERKRRKKEEVQQQKLAEERRRQNLQEEKERK 
LECLPPEPSPDDPESVKIIFKLPNDSRVERRPHFSQSLTVIHDF 
LFSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTLQE\A 
GLSHTEVLFVQDLTDE 




71 


2897 


FCSKDKCCLYLPDSINRSKSCrAKPGAHSQDRHAVMDSERQVKD 

TDDIESPKRS IRJDSGYIDCWDSERSDSLSPPRHGRDDS FDSLDS 

FGSRSRQTPSPDWLRGSSDGRGSDSESDLPHRKLPDVKKDDMS 

ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 

KSWSTATSPAGLGKKALQDYGPRT\ PVS \DDAES TSMFDMRC3E 

EAAVQPHSRARQEQLQLINNQLREEDDKWQDDLARWKSRKRSVS 

QDL I KKEEERKKMEKLLAGEDGTSERRKSIKTYRE I VQEKERRE 

RELHEAYKWARSQEEAEGILQQYIERFTISEAVLERLEMPKXLE 

RSHSTEPNLSSFLNDPNPMKYLRQQSLPPPKFTATVETTIARAS 

VLDTSMSAGSGSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 

VDGKVSVNGETVHREEEKERECPTVAPAHSLTKSQMFEGVARVH 

GSPLELKQDNGSIEINIKKPNSVPQELAATTEKTEPNSQEDKND 

GGKS RKGNIELASS E PQH FTTT VTRCS PTVAF VE FPS S PQLKND 

VSEEKDQKKPENEMSGKVELVLSQKWKPKSPEPEATLTFPFLD 

KMPBANQLHLPNLNSQVDSPSSEKSPVTTPFKFWAWDPEEERRR 

QEKWQQEQERLLQERYQ\KEQDK\LKEE\WEKAQKEVEEEERRY 

YEEEP * 1 1 \EDPWPFTVSSSSADQLSTSSSMTEGSGTMNKIDL 

GNCQDEKQDRRWKKS FQGDDS DLLLKTRES DRLEEKGS LTEGAL 

AHSGNP VS KG VHEDHQLDTE AGAPHCGTNPQLAQDPS QNQQTSN 

PTHSSEDVKPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 

PIX3KGAAMIXETL^YFHIQCFRCG\ICKGQLGEAVSGTDVRIR 

NGLLNCNDCYMRSRSAGQPTTL 


5832 


2454 


829 



PGRHFRHGSCAFQKQCIMLHICQYFLQGECKFGTSCKRSHDFSN " " 
SENLEKLEKLGMS SDL VS RLPT I YRNAED I KNKSSAPSRVPPLF 






WO 01/53312 



PCT/US00/34263 











VPQGTSER KDS S GS VS PNTLS QE EGDQ I ClrYH I RKS CS FQDKCH 
RVHFHLPY KWQ&'LDRGKWEDLDNMELI EEAYCNPK2 E RILCSE S 
ASTPHSHCLNFWAMT YGATQARRLSTASS VTJCPPHF I LTTDWI W 
YWSDE FGS WQE YGRQGTVHP VTT VS SSDVEKAYLAY /WYTG V*R 
PGSHLEVPGRKAQLRVRFQSLRSEKPGLWHN+KGLPQTQIR\AP 
QDVTTMQTCNTKFPGPKSIPDYWDSSALPDPGFQKITLSSSSEE 
YQKVWNLFNRTLPFYFVQKIERVQNLALffEVYQWQKGQMQKQNG 
GKAVD ERQLFHGTSAI FVDAI CQQNFDWRVCGVHGTS YGKGS YF 
ARDAAYSHHYS KS DTQTHTM FLARVLVGE FVRGNASF VRP PAKE 
GWSNAFYDS CVNS VSDPS I FVI FEKHQVYPE YVIQYTTSSKPS V 
TPSILLALGSLPSSRQ 


5833 


170 


3289 


S I LCLLS PC WQ FGKP WS I LS SRSRHS P CTKKGWEGMRKHLHT 
RQGHK* VHVE I S KALW VYRDDY F t RHS IS VS AVI VRAW I THK YR 
GRDWNVKWEENLLHAVAKNYTLLQTI PFFERPFKDHQVCLEWNM 
G Y I WNLRANR I PQC PLEND VVALLGF P YAS SGENTGI VKKFPR F 
RNRELEATRRQRMDYPVFTVSLWLYLLHYCKANLCGILYFVDSN 
EM YGTP S VFLTEEG YLH I QMHLVKGEDLAVKTKF 1 1 PLKEWFRL 
DISFKfGGQIWTTSIGQDLKSYHNQTISFREDFHYNDTAGYFXI 
GGSRYVAG I EGFFGPLKYYRIiRSLHPAQ I FWPLLEKQLAEQ I £CTj 
YYERCAEVQEIVSVYASAAKHGGERQEACHLHNSYIiDLQRRYGR 
PSMCRAFPWEKELKDKHPSLFQALLEMDLLTVPRNQNES VS EIG 
GKIFEKAVKRLS S IDGLHQIS S I VPFLTDSS CCGYHKASYYIAV 
FYETGLNVPRPQLQGML,YSLVGGQGSERLSS>IWLGYKHYQGIDN 
YPLDWELS YAY YSNIATKTPLDQHTLQGDQAYVETIR T »KDD E I L 
KVQTKEDGDVFMWLKHEATRGNAAAQQRIAQMLFWGQQGVAKNP 
EAAIEWYAKGALETEDPALIYDYAIVLFKGQGVKKNRRLALELM 
KKAAS KGLKQAVNGLG Wy YHKFKKNYA\KAAKYWLKA\ EE \MGN 
PDASYNLGVLHLDGIFPGVPGRNQTLAGEYFHKAAQGGHMEGTL 
WCS LY YITGNLETFPRDPE KAWWAKHVAEKNG YLGHVIR KGLN 
AYLEGS WKEALLYYVIiAAETGIEVSQTNIiAJHI CEERPDIiARRYL 
GVNCVWRYYNFS VFQIDAPS FAYLKMGDLYYYGHQNQSQDLEIiS 
VQMYAQAALDGDSQGFFW1ALLIEEGTI IPHHIUDFLE I DSTLH 
SNNISILQEI.YERCWSHSNEESFSPCSLAWLYLHLRLLWGAILH 
SAL I YFLGT FLLS I L IAWTVQ YFQS VS ASDPP PRP SQAS PDTAT 
STAS PAVTPAADAS DQDQPT VTNNPE PRG 


5834 


17 


4020 



RFRRGGGRVFPGAFPAS PSDSIjGQGNSQGPFRTPKP PRTyQECG " 
SAAPGP I PGQSSS * VPLRLEQIQQKADCPLSLEIiALKPRMAAQV 
TLEDALSNVDLLEELPLPDQQPCIEPPPSSLLYQPNFNTNFEDR 
NAFVTG IARYIEQATVHS SMNEMLEBGQEYAVMLYTWRSCSRAI 
PGVKCNEQPNRVEIYEKTVEVLEPEVTKLMNFMYFQRNAIERFC 
GEVRRLCHAERRKDFVSEAYLITLGKFINMFAVLDELKNMKCSV 
KNDHSAYKKAAQFLRKMADPQSIQESQNIjSMFLANHNKITQSLQ 
QQLBVISGYEELLADIVNLCVDYYENRMYLTFSEKHMLLKVMGF 
GLYLMDGS VS NI YKLDAKKR INLSKI D KYFKQLQ WPLFGDMQ I 
ELARYIKTSAHYEENKSRWTCTSSGSSPQYNICEQMIQIREDHM 
RFISELARYSNSEWTGSGRQEAOKTDAEYRKLFDLAUJGIiQLI. 
SQWS AHVMEVY S WKLVHP TDIGTSNKD CPDS AEE YERATRYN YT S 
EEKFALVEVIAMI KGLQVLMGRMES VFNHAIRHTVYAALQDFSQ 
VTLMEPLRQAI KKKKNVIQS VLQAI RKTVCDWETGHEPFWDPAL 
RGEKD P KSG*D 1 KVPRRAVGPS STQL YMVRTMLE SLI ADKS GS K 
KTLRSSLEGPTI LD I EKFHRES FFYTHL INFSETLQQCCDLSQli 
WFREFFLELTMGRRXQFPIEMSMPWILTDHILETKEASMMEYVL 
YSLDLYNDSAHYALTRFNKQFLYDEIEAEVNLCFDQFVYKLADQ 
X FAY YKVMAGS LLLDKRLRS E CKNQGAT I HLPPSNR YETLLKQR 
HVQLLGRS IDLNRL I TQRVS AAMYKSLELAIGR FES EDLTS I VE 
LDGLLE INRMTHKLLSRYLTLDGFDAMFREANHNVSAP YGR I TL 
HVF WELNYDFTtPNYC YNGS TNRF VRTVLPFSQE FQRDKQPNAQ P 
QYLHGSKALNLAYSS I YGS YRNFVGPPHFQVICRLLGYQGIAW 
MEELLKVVKSLLC^TILQYVKTbMEVMPKICRLPRHEYGSPGIl. 
EFFHHQIiKDIVEYAELKTVCFQNIjREVGWAILFCLLIEQSLSIiE 
EVCDLLHAAPFQNILPRVHVKEGERI^AKMKRLESKYAPUiLVP 








DIERLGTPUQIAIAREGDLLTKERLCCGLSMFEVILTRIRSFI^"" 
DPIWRGPLPSNGVMHVDECVEFHRLWSAMQFVYCIPVGTHEFTV 
EQCFGDGLHWAGCMIIVLLGQQRRFAVLDFCYHLLKVQKKDGKD 
EI I KNVPLKKMVERI RKFQ I LNDE I IT I LDKY LKSGDGEGT PVE 
HVRCFQPPIHQSLASS 


5835 


4209 


1904 


SGNIRMAQGSHQIDFQVLHDIiRQKFPEVPEVWSRCMLQKKNNL 
DACCAVLSQESTRYLYGEGDLNFSDDSGISGLRMrHMTSLKLDLQ 
SQNIYHHGREGSRMNGSRTLTHSISDGQLQGGQSNSELFQQEPQ 
TAPAQVPQGFNVFG.MSSSSGASNSAPHLGFHLGSKGTSSLSQQT 
PRFNPIMVTLAPNIQTGRNTPTSLHIHGVPPPVLNSPQGNSIYI 
RPYITTPGGTTRQTQQHSGWVSQFNPMNPQQVYQPSQPGPWTTC 
PASNPLSHTS5QQPNQQGHQTSHVYMPISSPTTSQPPTIHSSGS 
SQSSAHSQYKIQNISTGPRKNQIEIKLEPPQRNWSSKIiRSSGPR 
TSSTSSS VNSQTLNRNQPTV YIAASP PNTDELMSRSQPKVYI SA 
NAATGDEQ VMRNQ P TLF I S TWSGASAAS RNMSGQ VSMGPAF I HH 
HPPKSRAI GNNSATS PR VWTQPNT\ EYTFKITVS PNKPPAVS P 
GWS PTFE LTNLLNHPDH YVETEN IHHLTDPTLAHVDRISETRK 
&SMGSDDAAYTQD I *RISNS WLGM VAHACNS SALGGQDGR 1 1 + A 
Q EFET S WGNI WRLRLYRRF *NYAGMVAHTCSFS YSVD * ALLVHQ 
KARMERJbQRELEIQKKKLDKLKSEVNEMENNLTRRRLKRSNSIS 
Q I PSLEEMQQLRSCNRQLQ I D I DCLTKE I DLFQARG PHFNPS AI 
HNFYDN I G FVGP VP P KPKD QRS 1 1 KT PKTQDTBDDE GAQWNCTA 
CTFLNHPALI RCEQCEMPRHF 


5836 


361 


2303 


FrfITMCGl6csVNFSAEHFSQDLKEDLLYNLKQRGPNSSKQLLK 
SDVNYQCLFSAHVLRXiRGVLTTQPVEDERGNVFLWNGElFSGIK 
V3AEENDTQILFNYLSSCKNESEILSLF5EVQGPWSFIYYQASS 

hylwfgrdffgrrsllwhfsnlgksfclssvgtqtsglanqwqe 
vp as \ d fs e l ils llis fpdalf ynci lgni flgr i llkkml i a* 
v:<fqqtyqhlyqr*qmkpncilknllfl*i*cx:hklhwrliavi 
fpmchlqer yfks fllmyt * keviqqfi dvlsvavkkr vlcl pr 
denltanevlktcdrkanvailfsggidsmviatladrhiplde 
p idiilnvafiaeektmpttfnregnkqknkcei ps eefs kdvaa 

AAADS PNKHVSVPDR I XGRAGLKELQAVS PSRI WNFVE I NVS ME 
EIiQKLRRTRICHL I R PLDTVLDDS IGCAVWFASRG IGWL VAQEG 
VKSYQSNAKWLTG I GADEQLAGYSRHRVRFQSHGLEGIjNKE IM 
MELGR I SS RNLGRDDRVIGDHG KE ARF P FLDENVVS FLNSL P I W 
EKANLTIiPRGIGEKbLLRIiAAVSLGLTAS ALLPKRAMQFGSR IA 
KMEKINEKASDKCGRLQIMSLENLSIBKETKL 


5837 


4792 


903 



NGNAVAQAPVTNCC YLATGSKDQTIRI WSCSRGRGVMI LKLPFXi 
KRRGGGIDPTVKERLWLTLHWPSNQPTQLVSSCFGGELLQWDLT 
QSWRRKYTLFSASSEGQNHSRIVFNLCPLQTEDDKQLLLSTSMD 
RDVKCWDI ATLECS WTLPSLGGFAYStiAFS S VDIGS LAI G VGDG 
MIRVWNTLS I KNKYDVICNFWQGTOSKVTALCWHPTKEGCLAFGT 
DDGKVGL YDTYSNKP PQISS TYHKKTVYTLAWGP P VPP MSLGGE 
GDRPSLALYSCGGEGIVLQHNPWKLSGEAFDINKLIRDTNSIKY 
KLPVHTE I SWKADGKIMALGNEDGSIE I FQ\ I PNLKLI CTIQQH 
HKLVWTISWHHE\HGSPAQKLSYL\MPSGSQQCSPFTCttNLKKC 
P* KAAPES PSDPLQS PYRTPPQGHTAQDYPVWAWEPHIH * WEGL 
VFCFPIDGYSPGCWD\AFPGKEAPVAIFRG\HQGRLLCVAWSPL 
DPDCI YSG \ ADDFCVHKWLTS MQDHSRP PQGKKS IE LEKKRLS Q 
PKAKPKKKKKPTLRT P VKTiES I DGNEEE SMKEMSGP VBNG VSD Q 
EGEEQARE PELPCGLAPAVS REP V I CTP VS SGFEKS KVT INNKV 
I LLKKEP P KEKPE TL I KKRKARS LLPLS TS LDHRSKEELHQDCL 
VLATAKHS R E LNE D VS AD VE ERFHLGL FTDRAT L YRM I D I EG KG 
HLENGHPELFHQLMLW KGDLKGVLQTAAE RGELTDNLVAMAPAA 
3YHVWLWAVEAFAKQLCFQDQYVKAASHLLS IHKVYEAVELL.KS 
tJHFYREAIAIAKARLRPSDPVLKDLYLSWGTVLERDGHYAVAAK 
CTYLGATCAYDAAKVIiAKKGDAAS LRTAAELAAI VGEDELSASLA 
LRCAQELLLANNWVGAQEALQlxHESLiQGQRLVFCLLELLSRHLE 
SKQLSEG KS S S S YHTWNTGTEGP FVERVTAVWKS I FSUDTPEQ Y 
JEAFQKLQN I KYPSATKNTPAKQLI*LHI CHDLTLAVLSQQMAS W 








DEAVQALLRAWRS YDSGSPT 1 MQE VYSAFLPDGCDHLRDKLGD - " 

HQSPATPAFKSLEAFFLYGRLYEFWWSLSRPCPNSSVWVRAGHR 

TLSVEPSQQLDTASTEETDPETSQPEPNRPSEIjDIJ^LTEEGERM 

LSTFKELFSEKHASLQWSQRTVAEVQETLAEMrRQHQKSQIiCKS 

TANGPDKNEPEVEAEQPLCSSQSQCKEEKNEPLSLPELTKRI/TE 

ANQRMAKFPES I KAWPF PDVLB CCLVLLL I RSH F PG CLAQ EMQQ 

QAQELLQKYGNTKTYRRHCQTF CM 


5838 


no 


98 


KTMPHLLVT FRDVAI DFS QEE WE CLDP AQRDL YRD VMLEN YSNL 
ISLDLESSCVTKKLS PEKE I YEMES \ PSGRI WGNVST3 TFQYNG 
LGDNMECKGNLEGQVSKSEGL YMCVKI TCBE KATESHSTS STFH 
RI I / H YQGK I VKCKE CRQG FS YLS CL I QHEEWHN I * KCSEVNKH 
RNTFS KKPS Y I * HQ \ KFRLGEKP YE CMECGXAFG RTS DL I QHQK 
XHTNEKPYQCNACGKAFIRGSQLTEHQRVHTGEKPYDCKKCGKA 
FS YCSQYTLHQR IHSGEKPYEC KDCGKAFILGSQLTYHQR IHSG 
EKP YE CKECGKAF I LG SHLrYHQRVHTGE KP YI CKECGKA FLCA 
SQLNEHQRIHTGSKPYECXECX3KTFFRGSQLTYHLRVHSGERPY 
KCKE CGKAF I SNSNL I QHQRIHTGE KP YKC KECGKAF I CG KQLS 
EHQRIHTGEKPFECKECGKAFIRVAYLTQHEKIHGEKHYECKEC 
GKTFVRATQLTYHQR IHTGEKPYKCKECDKAF/HLWLT I LSEHQ 
RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFSRGSEHTLHQRIHTGEKPYTCVQCGKDFRCPSQLTQHTRL 
HN*EYSSHKICMHSIALASLDFAHLQEKNPEN 


5839 


1 


2425 


GRP F P R PPRALPRLPLRGRRQDGRWT VDFE BCLKD\ SPRFRAAL 
EEVEGD VAELELKL\DKLVKLCI A\MI DTGKAFCVANKQFMN3 1 
rd\laqns \NNDA\ WETKFAPS FLDSLQEMINFHTIL/L* pns 
E IN*GHS FQNF VKEDLRKFKDAKKQFE Jf SQ * KRKKIALVKNAPV 
PSRPASLEL*KPPNILTATRKCFRHIALDYVLQINVLQSKRRSE 
ILECSMLSFMYAHLAFFHQGYDLPSELGPYMKDLGAQLDRLVGDA 
AKEKREMEQKHST I QQKDFSRDDS KLKYNVDAANG I VMEGYLFK 
RASNAFKTWNRRWFSIQNHQVVYQKKFKDNPTVVVEDLRLCTVK 
HCED I ERRFCFE W5 PTKS CMLQAD SE KLRQAW I KAVQTS I \ AT 
AYRBKDDESEKLDKKSSPSTGSLDSGNESKEKLLKGESALQRVQ 
CIPGNASCCDCGLADPRWAS INLGITLCIECSG IHRSLGVHFS K 
VRSLTLDTMEPELLKLMCELGMDVINRVYEANVEKMGIKKPQPG 
QRQEKEAYIRAKYVERKFVDKIFL* SLSPP\EQQKK\ FVS KSSE 
EKRLSISKFGP\GDQVRASAQSSVRSKDSGIQQSSDDGRBSLPS 
TVSANS L YE P EGERQDSSMFLDS KHLNPGLQL YRAS YEKKLP KM 
AEALAHGADVNWANSEENKATPLIQAVLGGSLVTCEFIiQNGAN 
VNQPJ5 VQGRGPLHRATVLGHTGQ VCLFLKRGANQHATDEB GKD P 
LSrAVEAANADIVTLLRLARMNEEMRESEGLYGQPGDETYQDIF 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHIiHL PRQHLTTLWQ I S S PRWRS PQRAFMSALS KTQTQSAPALQ 
GLSSLLQSVTGNPVPASEAASQSTSASPANTTVYTIKGRNLPSS 
AQPFI PKSFNYS PNSSTSEVSSTSASKAS IGQSPGLPSTAFKLP 
SNTKGFTATHNTS PAAPPTE VTICQS SE VSKPKL\ESESTSPS L 
\2MKIHNFLKGNPGFSVA*NLKHPNPAGSLGSSAPSESHPSDFQ 
RGPTS TS IDN I DGTP VRDE RSGTPTQDEMMDKPTS SS VDTMSLL 
SKIISPGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPFGLGSE 
SPYKQPSDGMERPSSLKDSSQEKFYPDTSFQEDEDYRDFEYSGP 
PPSAMMNLQKKPAKS ILKSS KLSDTTE YQPILSS YSHRAQEFGV 
KSAFPPSVRALLDSSENCDRLSSSPGLFGAFSVRGNEPGSDRSP 
SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPHPVPHRS 
LFSPQNTIiAAPTGHPPTSGVEKVLASTISTTSTIEFKNMLKNAS 
RKPSDDKHFGQAPSKGTPSDGVSLSNLTQPSLTATDQQQQEEHY 
RI ETRVSSSCIjDL PDSTEEKGAP IETLG YHSASNRRMSGEPIQT 
VE S IRVPGK3NRGHGREASRVGWFDLS TSGSS FDNGP SS AS ELA 
SIiGGGGSGGLTGFKTAPYKERAPQFQESVGSFRSNSFNSTFEHH 
LPPSPLEHGTPFQRE P VGPSSAFPVPPKDHGGI FSRDAPTHLPS 
VDLSNPFTKEAALAHAAPPPPPGEHSGIPFPTPPPPPPPGEHSS 
SGGSGVPFSXPPPPPPPVDHSGWPFPAPPLAEHGVAGAVAVFP 
KDHSSLLQGTLAEHFGVLPGPRDHGGPTQRDLNGPGLSRVRESL 


5841 






TbPSHSLEnLGPPHGGGGGGGS-MSSSGPPLGPSHRPTI SRSGI I 
^P RPDFRPR EPF^SRDPFHSLKRPRPPPARGPPFFAPKRPFF 


5842 


1908 


762 


GUiLFLVLTvwPMMKPSWLSRTEFSKRLLCRTLWCQSGWSSRSY 
TRSMLKMTTS INRRSRTS TKSTRTSARPGLTATVS IGLSDS PTW 
RHCWMTARS CSGEKGGHWAPRQ VGV Y LLPGRVG CVS S R VS PSFP 
GDGLDSGLARRGSAVSAIiASGLVESPMLGPPFHPTPRFKAVSAK 
SKE DLVS QGFTE FTI EDFHNT FMDL I EQ VEKQTS VADLLAS FND 

QSTSDYLWYLRLLTSGYLQRESKFFEHFIEGGRTVKEFCQ\QE 
\VEPMCKESDHIHIIAIAQGLQRVHPGWEYMGPRPRAATTMPHT 
FP * G LPS P KVYLLYRPG \H YD I L YKI GLGS S PLGCPGCPLLARA 
LGHCYRGFSVWKWSYFTPFFIiSHDPpPMFY 


5843 


307 


1918 


QEPTADFKLKSTCGCGREMTCPDKPGQLINWFI CSLCVPRVRKL 

WS SRRPRTRRNLIjLGTACAI YLG FLVSQVGRAS LQHGQAAE KGP 

HRS RDTAEPS F PE I PLDGTIAP PES QGNGSTLQPNVVY I TLRSK 

RS KPANIRGTVKPKRRKKHAVASAAPGQEALVG PSLQ PQEA\ EG 

KLML * HLGTLREQTWLRLESDPGGW CG VR E / WRAGGPDFLQPS S 

RESNIRI YS ESAPSWLSKDDIRRKRLLADSAVAGIjRPVS SRSGA 

RLLVLEGGAPGAVLRCGPS PCGLLKQPLDMS EVFAFHLDR I LGL 

NRTLPS VSRKAB F IQDGR P CP 1 1 LWDASLS S AS WDTHS S VKLTW 

GT YQQLLKQ KCWQNGR VPKPESG CTE I HHHBWSKMALFD FLLQ I 

YNRLDTNCCGFRPRKEDACVQNGLRPKCDDQGSAALAHI IQRKH 

DPRHLVFIDNKGFFDRSEDNLNFKLLEGIKEFPASAVYVLKSQH 

LRQKLLQSLFLDKGYWESQGGRQGIEKLIDVIEHRAKILITYIN 
AHGVKVLPMNE 


5844 


500 


1453 


GTARLVTCWVLHGQ* VKJCPAWEPGWHL*Q*RCRPKGWGLGAGM " 

R3SRMSQPPQCLRRAQSSCCHFMVKt,LDDGTFMiPGEKVAHTSL 

DAIiVT FHQQKP IEPRRELLTQ PCRQ KDPANVD YEDL FL YSNAVA 

EEAACPVSAPEEASPKPVLCHQSKERKPSAEM/RQNNHQGSHFL 

LPPKIPSWRDPPETLEEPQNAPRERPEGPAAAKKPPRHCBLWT 

LGCPEIHGDLRPWDRKRQPRSLRGSHLGGQRLHGSLCGHI3QKP 

VQA^P^G^ PHQEGRBVG ^* GDPRGQE ^ PWGSESP 


5845 


202 


2471 


* l^yAVLSS INVMAVLPGPLQLLG VLLTISLiS SIRIiIQAGAY YG I 
KPLPPQIPPQMPPQIPQYQPLGQQVPHMPIAKDGLAMGKEMPHL 
QYGKEYPHLPQYMKSIQPAPRMGKEAVPKKGKEI PLASLRGEQG 
PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 
PGKPGAMGMPGAKGEIGQKGEIGPWG1P * PQGPPGPHGkPG I GK 
PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPGAPGV 
KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\PGAPGEP 
GPQGPIGVPGVQGPPGIPGIGKPGQDGMPGQPGFPGGKGEGGL 
PGLPGPPGIjPGIGKPGFPGPKGDRGMGGVPGAIjGPRGEKGPIGA 
PGIGGP PGE PGLPG I PGPMGPPG AIGFPGPKGEGG I VG PQG PPG 

pkgepgi^gfpgkpgflgevgppgmrgfpgpigpkgehgqkgvp 

GLPGVPGLLGPKGEPGI PGDQGLQGPPGI PGIGGPSGP IGP PG I 

pgpkgepglpgppgfpgigkpgvaglhgppgkpgalgpqgqpgl 
pgppgppgppgppavmpptpppqgeylpdmglgidgvkpphayg 
akkgknggpayempaftaeltapfppvgapvkfnkllyngrqny 

NPQTGrFTCEVPGVYYFAYHVHCKGGNVWVALFKNNEPVMYTYD 

EYKKGFLDQASGSAVLLLRPGDRVFU2MPSEQAAGLYAGQYVHS 
SFSGYLLYPM 




215 


2061 



tiASNKSAS LQDKWANP KEKTAMCLVNELARFNRVQPQYKLLNER 
3PAHS KMFS VQLSLGEQT WES EGSS I KKAQQAVGNKAI/TESTLP 
KPI*KPPKSWVNNNPGCITPTVELNGIjAMKRG\KPAIHRPtiDPK 
3 FPNNRANYNFQVMYNQRYHCP I PKI FYVQLTVGNNE FFGEGKT 
iQAARHNAAMKALQALQNEPI PERS PQNGES GKDMDDDKDANKS 
:iSl ) VFEIALKRNMPVSFEVIKESGPPHMKSFVTRVSVGEFSAE 
3EGNSKK^KKRAATTVLQELKKLPPr.PVVEKPK\HFFKKRPKT 
[VKAGPEYGQGMNPISRLAQIQQAKKEKEPDYVLLSERGMPRRR 

:fvmqvkvgnevatgtgpnkkiakknaaeamllqlgykastnlq 

K5LEKTGBNKGWSGPKPGFPEPTNHTPKGI LHLS PDVYQEMEAS 








WGTSSTAEAIGLRGSSPTPPCSPVQPSKQLEYJLARIQGPQVHYC 
DRQSGKECVTCLTLAP VQMTFHAIGSS I EASHDQ V* YAT AIL LC 

YGPARKWKAIKMEAMCAHAALLSLIHYLIAPSARLBKSKLFALG 
N- 


5846 
5847 


1126 


456 


FSKLIKKTFl IGISGVTNSGKTTLAKNLQKHLPNCS VI SQDDFF 

KPESEIETDKNGFLQYDVLEAIiNMEKMMSAISCWMESARHSVVS 

TDQESAEEIPILIIEGFLLFNYKPLDTIWNRSYFLTIPY3ECKR 

RRSTRVYQPPDSPGYFDGHVWPMYLKYRQEMQDITWEWYU3GT 

KSEEDLFLQVYSDLIQELAKQKCLQVTA*RRNTTNPfi/CK*IRK 
LQGVI 


5848 


2769 


505 


APEMEDLSSPDSTLLQGGHNLLSSASFQBSVTFKDVIVDFTQEE 
WKQLDPGQRDLFRDVTLENYTKLVS I GhQVS KPDVI SQLEQGTE 
PWIMEPSIPVGTCADWETRLENSVSAPEPDISEEELSPEVIVEK 
KKRDDSWSSNLLESWEYEGSLERQQANQQTLPKEIKVTEKTIPS 
WE KGPVNNE FGKS VNVS SNL VTQE P S P EETS TKR S I KQWSNP VK 

KEKSCKCNECGKAFSYCSALIRHQRTHTGEKPYKCM*/CVEKAF 

SRSENLINHQRIHTGDKPYKCDQCGKGFIEGPSLTQHQRIHTGE 

KPYKCDECGKAFSQRTHLVQHQRIHTGEKPYTCNECGKAFSQRG 

HFMEHQKIHTGEKPPKCDECDKTFTRSTHLTQHQKIHTGEKTYK 

CNECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSQHSNLTQ 

HQKTHTGEKPYDCAECGKSFSYWSSLAQHLKIHTGEKPYKCNEC 

GKAFSYCSSLTQHRRIHTREKPFECSECGKAFSYLSNLNQHQKT 

HTQEKAYECKECGKAFIRSSSLAKHERIHTGEKPYQCHECGKTF 

'SYGSSLIQHRKIHTGERPYKCNECGRAFNQWIHLTQHKRIHTGA 

KPYECA2CGKAFRHCSSLAQHQKTHTEEKPYQCNKCEKTFSQSS 

HLTQHQRIHTGEKPYKCNECDKAFSRSTHLTQHQRIHTGBKPYK 

CNECGKXTFSQSTYLIQHQRIHSGEKPFGCNDCGKSFRYRSAZ.N 
KHQRLHPGI 


5849 


22 


2961 


AAPRRLLRGGDGDRTPKFPLPALLRPGPPAKAAPERRKMPAVSK 
GDGMRGIiAVFI SD I RNTCKSKEAE IKR I NKEtAMI RS KFKGDKAL 

D3YSKKKYVCKr.LFXFLLGHDIDFGHMEAVNLLSSKRYTEKQIG 
YLF I S VLVNSNS EL I RL INNAI KNDLAS RNTPTFMGLALHC I AS V 
GSREMAEAFAGEIPKVLVAGDTMDSVKQSAALCLLRt.YR'ISPDL 
VPMGDWTSRVVHLLNDOJU»GWTAATSLITTLAQ£QNrPEEFICTSV 
S LAVSRLS \ RIVTSAS TDLQD YTY* FCPG FLGLS VKLLRLLQC Y 

PPPDPAVRGRLTECLETILNKAQEPPKSKKVQHSNAKNAVLFEA 
ISLIIHHDSEPNLLVRACNQLGQFLQHRETNLRYIALESMCTLA 
S S EFSHEAVKTH I ETVINALKTERDVS VRQRAVDIjL YAMCDRSN 
A?Q I VAEMLS YLETAJDYS IREE I VLKVA I LAEKYAVDYTW\ YVD 
TI IiNLIR I AGD YVSEEVWYRVI Q IV1NRDDVQGYAAKT VFEALQ 
APACHENLVECVGGYTLGPT?fIMT,Tannr)t>eonT T^TJir-r T > Tn » m r* 

CSVI^RALIiSTYIKFVNLFPEVKPTIQDVLRSDSQLRWADVEL 
QQRAVEYLRLSTVASTDIIATVLEEMPPFPERESSILAKLKKKK 
GPSTVTDL EDTKRDRS VDVNGGPE PAPASTSAVS TPS PSADLLG 
LGAAPPAPAGPPPSSGGSGLLVDVFSDSASWAPLAPGSEDNFA 
RF VCKNNGVLFENQLLQ IGLKS E FRQNLGRM F I F YGWKTSTQFL 
NFTPTLICSDDLQPNLMIK3TKPVDPTVEGGAQVQQWNIECVSD 
FTEAPVLNIQFRYGGTFQNVS VQLP ITLNKFFQPTEMAS QDFFQ 
RWKQLSNPQQE VQN I FKAKHPMDTE VTKAKI IGFGSALLEEVDP 
NPANFVGAG 1 1 HTKTTQ I GCLLRL E PNLQAQMYRLTLRTS *<EAV 
SQRLCELLSAQF 




3545 


1895 


KRREIKETVFHHVAQAGLELLSSSKPPSSASRSAGITtiMRHQVQ 
P*DPCMSLSPPCFTEEDRFSLEALQTIHKQMDDDKDGGIEVEES 
□EFIREDMKYKX>ATNKHSHLHREDKHr TIEDLWKRWKXSEVHNW 
rLEDTLQWLIEFVELPQYEKNFRDNNVKGTTLPRIAVHEPSFMI 
5 QLK I S DRSHRQ KLQLKALDWLFGPL TRP PHNWMKDFI LTVS I 
/ 1 G VGG CW F A YTQNKTS KEHVAKMM KD LES LQ TAEQS IjMD LQ ER 
-.EKAQEENRNVAVEKQNL^RKMMDEINYAKEEACRLRELREGAE 

:elsrrqyaeqel,eqwmalkkaekefelrsswsvpdalqkwlq 

jTHEVEVQYYNIKRQNAEMQLAIAlmEAEKIKKKRSTVFGTLHV 








AHS SS LDEVDH KHjEAKKALSELrTCLRERL?RWQQ I SK I CGFQ~~ 

1AHNSGLPSLTSSLYSDHSWVVMPRVS I PPYPIAGGVDDLDBDT 

PPIVSQFPGTMAKPPGSIARSSSLCRSRRSIVPSSPQPQRAQLA 

PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPAIjYRNBEE 

EEAIYFSAEKQMEVP0TASEGDSLNSSIGRKQSPP/SKPRDIPN 

IIS/DERYQEMRCP*RIPSGGIL 


5850 


3 


1895 


KAVLNFSASGS VIS LTGSMPMHDASMWHLKKNGI I VYLD VPLLN 
LI CRIiKLMKTDRI VGQNSGTSMKDLLKFRRQYYKKW YDARVFCE 
S GAS P E E VADKVLNAI KR YQDVDS ETFI S TRHVWPED CE QKVSA 
EFFIEAVIEGLASDGGLFVPAKEFPKLSCGEWKSLVGATYVERA 
Q ILLERCIHPADI PAARLGEMI ETAYGENFACS KIAPVRHLSGN 
QFILELFHGPTGSFKDLSLQLMPHIFAQCIPPSCNYMILVATSG 
DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQ I IGSQ 
RENGWAVGVESDFDFCQTAI KR I FNDSDFTGFLTVE YGTI LSSA 
KS INWGRLLPQWYHASAYLDLVSQGFIS FGS P VDVCI PTGNFG 
KIXiAAVYAKMMGIPIRKFI CASNQNHVWTDFIKTG\HYDIjRGKE 
N*AQTFFTVQ*IFLPNLSKFLERHLHLMANKDGQLMTELFNRLES 
QHHFQIEKALVEKLQQDFVADWCSEGECLAAINSTYNTSGYILD 
PHTAVAK WADR VQDKTCP VI XSS TAH YS KPAPAIMQALKIKE I 
NET S S S QLYLLGS YNALPP LHEALLERTKQQEKME YQVCAAE*4N 
VIiKSHVEQLVQNQFI 


5851 


3120 


1802 


RCYLQFLALLLTSTSARAAAAIAAAEEPAGSPSVMTRAGDHKRQ 
RGCCGSLADYLTSAKFIiLYIiGHSLSTWGDRMWHFAVSVFLVEIjY 
GN S LLLTAVYGL WAGS VLVLGAI I GD WVDKNARLKVAQTS LW 
QNVSVILCG I ILMM VFLHKHELLTM YHGWVLTS CYI LI ITTANI 
ANLASTATAITIQRDWIVWAGEDRSKIiANMNATIRRIDQIjTNI 
LAPMAVGQIMTFGSPVIGCGFISGWI^LVSMCVBYVIjLWKVYQKT 
PALAVKAGLKEEETELKQLNLHKDTEPKPLEGTHLMGVKDSNIH 
ELEHEQEPTCAS QMAEPFRTFRDGWVS YYNQPVF / LGWHGS CFP 
LYDCPGL* LHHHRVRLHSGTEWFHPQ YFDG5 1 S YNWNNGNCS FY 
LATSKMWFGSDRSDLRIGTAFLFDIiVCDLCIHAWKPPGIiVRFSF 


5852 


1 


422 


KTTFPSSLCPLRQLPEVRGYSGQPLTDPLISLCRSHKCRGKGWG 
SSS YPSLPALLRARSAPGHCTHRSCGPEWRIDS I SRLEMQGA3R 
SGWAQAQPTILLLVPRLRKSLPSIWG/SLMGFFITSGPG/WFRQ 
YYFFI SGRH+ VLFTESDFY YVAMDFGGHGL3SHYS PGVP YYXQr 
F VS E I RRWAGKKQS VYFRRCGGCS RAP PIiI TGGGVGSRKQRWP 
ESGAWALAPGLPAIHGRSWES 


5853 


223 


1346 


R LLGLS RVKGLHGPAASAW IS DPETRGD PGGPWGM WRG&DL RPR 
PVSLTGLTLVCK*AAQGPQV\HSVKLCFGLGG\PCLL\FPIFRP 
LLLEPRRPRLHPGTRGVAVEPHALRWHVAHGEEAGIRAAGPGH 
GGVEIPQG/VGSLGARRGLRPSRPSSRHRNRVPAPPPGRPLATP 
HRR R FP PD PALTCPGLGQDQGPRE QQKQGSGRHDT I LGDWGES E 
SRWVRGNFRTGTAATLIGFSRNPTLNGS ENWGSLVS IQEEGPDT 
GWEREKRNPAEMGNPQRWASPIHTPPLGPEIIiRAMPEALRAMPE 
AIiGLRPDPArSVPSALS/QTF/PESWPRSCLRNQGETLGMGPVP 
LSSLCITESPSQNWTPCLLLLTCPRGLF 


5854 


86 


938 


KGRNTAPEKKXSAAI^RENASS *NGY/SRWKQDIRRIENHI IQE 
LXHLCAMIKRVLLERLENTRKLRELTEGRTLDWPQWRITEVSAX 
RQI VTE YREKGKRN* EEKKRDLEGRSRRYNLCI IG I PETEDRAS 
GAET I KDLLE/ENFPEIi KNErinrjOMP ft a up t pt .wpwtt w a n o t> u 

I R VTFI*/ KFQRRNI LQASSQRKQVTYKG AKVRLTS DFS PAI LNA 
RRQW/N/PISRVLRENNFEPRIIYSAKLSFLYKGNWKTFLDIQG 
LGKYINQELSLKILLKDLLQLTENLN 


5855 


536 


2391 


LRSYGCKAPSRISHLHK\FLFLLI*PSLLMGYSESPPPITDSWAP 
FISLTHHVLSQSQSPLSSNCWICLSIHTQ*FTALPADI»LTWTQS 
NVSLHI S YLAIPFLADS FLKPV/L * PGKSAKHLSFKLS SLSMVS 
GRAVALLHLIASGI.TS IQTNTASS KPPI WGY\LSTQTS FI SPPP 
IiCLS RTYPNPAHATMVGQVPQS LCGLI FTL/ RTP CRPS I LH PNY 
KI I STSAWQKVLCFSGSPTIHTSLHLTTGSS FLSFHP I PGFPAA 
NSALYVSSLJCGPPGKNVTIPSPVTGT*QPPHRGSN/RLTVDKDN 








FFLSPKPNSliHQLPSQ\TPYQAtiTGAALAGSYPIWENENTLSWL 
PTPI*yKFCLSTPSLFFLCDTN*YLCLPANWSGTCTLVPOAPTIN 
ILPPNQTILISVBASISSSPIRNKWALHLITLLTGLGXTAALGT 
GIAGITTSITSYQTLFTTIiSNTVEDMHTSITSLQRQLDFLVGVr 
LQNWRVLDLLTTEKGGTCIYLQEECCFCVNESGIVHIAVRRLHD 
RAAEL* HQVADS WWQGSSLLRWI PWVAPFLGPLI FLFLLLM I GP 

CIFNLVSRFlSQRLMCFIQASMQKHIDNIFHU3IV*YQSLRONH 
SEAPEPRP 


5856 


173 


1137 


PWLHGLGLaAVFLFYL* / YVTFHLYGGI ILLLLIFISIAGILYK 
FQD VLL YFPEQPSS SRL Y VPM PTG I PHENI F I RTKDG X RLNLI I> 
IRYTGDNSPYSPTIIYFHGNAGNXGHRLPNALLMLVNLICVNLLL 
VDYRG YGKSEGEASEBGIj YLDSEAVLDYVMTS PDLDKTKI YLSG 
RSLG\GAAAIHIiASDNSHRISAIMVENTFIiSIPHMASTLFSFFP 
MRYXiPLWCYKNKFLSYRKXSQCRMPSLFISGLSDQLIPPVMKKQ 
LYELSPSRTKRLAI FPDGTHNDTWQCQG YFTALEQF IKEWKSH 
SPEEMAKTSSKVTII 


5857 
5856 


1597 




KLIGKVLVX.SWADAMAAFAVEPQGPALGSEPMMLGSPTSPKPG 
VNAQFLPGFLMGDLPAPVTPQPRSISGPSVGVMEMRSPLLAGGS 
PPQPWPAHKDKSGAPPVRSIYDDISSPGLGSTPLTSRRQPNIS 
VMQSPLVGVTSrPGTGQSMFSPAS 1 GQFRKTTLS PAQLDP FYTQ 
GDSLTSEDHNLDDSWGDCIWGFLKASAXSYILLVQFAQYGGIS* 
NMWMSNTGNWMH I R YQS KLQARKALS KDGR I FGE SIK IG VKPCI 
DKS VMESSDRCA1.SS PSLAFTPP I KTLGTPTQPGSTPRISTMRP 
LATAYKASTSD YQ VI S DRQTPKKDE SL VSKAME YMFG W 




355 


1419 


PPHQPAAASTSXHQQQQPPPPPQJDSSKPWAQGPGPAPGVGSAP 

PAS SSAPPATPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPP PTP 

PSSGVPTTPPQAGGPPPPPAAVPGPGPGPKQGPGPGGPKGGKMP 

GGPKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 

GPPPGGPGGRSEEKISGPRRGFKANLSLLRRPGEKTYTQRCRFC 

LLGIYLLISRRMNSRRLFAKIWENQEKFLSTKAKDSEFIKLESR 

ALA+UCPKPELG*YTP*GGRQLPSSLFPTHACLPLSCSVIFSPF 

KFPQ*NCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPTGKGHCLN 
FAS 


5859 


307 


1503 


GGSSAR PRAS S RRMIjSRKKTKNE VS KPAE VQGKYVKKETS PLLR 

NLMPS FI RHGPTI PRRTDI CLPDSS PNAFSTSGDG WSRNQS FL 

RTP IQRTPHE IMRRESNRLS APSYIiARS IADVPRE YGSSQS FVT 

EVSFAVENGDSGSRYYYSDNFFDGQRKRPLGDRAHEDYRYYEYN 

HDLFQRMPQNQGRHASGIGRVAATSLGNLTNHGSEDLPLPPGWS 

VDWTMRGRKYYIDHNTNTTHWSHPLEREGLPPGWERVESSEFGT 

YYVDHTNKKAQY\RHPCAPTCTSV*ST"SCHI/AS/RQQTERNQ 

SLLVPANPYHTAEIPDMLQVYARAPVKYDHIIiKWELFQLADLDT 

YQGMLKLLFMKELEQIVKMYEAYRQALLTELENRKORQQWYAQQ 
HGKNF 


5860 


2956 


1270 


TIRVEEFPLCPGGGKAQLSSASLLGAGLLLQPPTPPPLLLLLFP 
LLLFSRLCGALAGPirVEPHVTAVWGKNVSLKCLIEVNETITQI 
SWEKIHGKSSQTVAVHHPQYGFSVQGEYQGRVLFKNYSLNDATI 
TLHNIG FSDSGKY I CKA VTFPLGNAQSSTT VTVL VEPTVSL IKG 
PDSLIDGGHETVAAXCIAATGKPVAHIDWEGDLGEMESTTTSFP 
NETATIISQYKLFPTRFARGRRITCWKHPALEKDIRYSFILDI 
QYAPEVSVTGYDGNWEVGRKGVNLKCKADANPPPFKSVWSRLDG 
QWPDGIiIASDNTLHFVHPLTFNYSGVYICKVT \NS PGS KEVTQK 
VHPTFQDPSLPTYPPLPALQFQWASPSTA*TSRO\IiATEP*KlA 
PSPLSTL\ATXKGWTQLPTIIA* CSGVGALFIV\LVKCFGLGIF 
CYRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDELDPYPDSV 
KKENKNPVNNLIRKDYLEEPEKTQWNNVENLNRFERPMDYYEDL 
KMGMKFVSDEHYDENEDDLVSHVDGSVISRREWYV 


5861 


2051 


1305 



EVCACVQAFWLVASSGDDSQGGDKCGCEVGSWVGSMRVVMARL'ri 
SEGEQGIPTACAAFAQQPAG/EPRRGI*AGVGEGGPQCSWVNYRC 
ri»EFLVSLLGTDLARGRGMSASGPTAPADS KQL/ML * D VHRRVT 
LiE*RMNSGSPARDNAPSQRFCTNLSEG1jRFGISPSWREA1iYGCH 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 



5862 


1556 


483 


PP FQL I MGiS I KVS PD YNWFRGT VP LKKI I VDDDDS KIWS L YDAG 
PRS IRCPIiI FLP PVSGTADVFFRQ ILALTGWGYRVIALQYPVYW 
DHLEFCDGFRKLLDHLQLD KVHLFGAS LGG FLAQKFAE YTHKSP 
RVHSL I LCNSFSDTS X FNQ TWTANS FWLMPAFMLKKT VLGJJFSS 
GPVDPMMADAIDFMVDRLESLGQSEIASRLTLNCQNSYVEPHKI 
RDI PVTIMDVFDQSALSTEAKEEMYKLYPNARRAHLICrGGHFPY 
LCRSAEVNLYVQIHL/R / RNS MEPNTR PLTHQVJS VPRS LRCRKA 

ALASARRSSSVSLAVNDEIiTRCVLV*SVASAPVSRPFPSGSSGS 
PVLTVSGK 


5863 


2714 


243 


PFPSRGSLPIiAAPREDTMGPLMVLFCLIjFLiYPGLADSAPSCPQN 
VNISGGTFTLSHGWAPGSLLTYSCPQGIjYPSPASRLCKSSGQWQ 

tpgatrslskavckpvrcpapvsfengiytprlgsypvggnvsf 
ecedgfi \lrgs pvrqcrpngmwdgetavcdngaghcpnpg isl 
gp\vrtgfrfghgdkvryrcssnlvltgsserecqgngvwsgte 

PICRQPYSYDFPEDVAPAIjGTSFSHMLGATNPTQKTKESI.GRKI 
Q IQRSGHLNL YLLLDCS QS VS ENDFL I FKESAS LM VDR I FS F3 1 
NVSVAI r TFASEPKVLMSVLNDNSRDMTEVISSLEMANYKDHSN 
GTGTNTYAALNS VYLMMNNQMRIjLGMETMAW\QE I RHAI ILL\T 
DG K\ S HMGGS P KTAVDH I RE I LN INQ KRNDYLDI YAIGVGKLDV 
DWRELN2LGSKKDGERHAFILQDTKALHQVFEHMLDVSKLTDTI 
CGVGNMSANASDQERTPWHVTIKPKSQET\C\RGAL1SDQWVLT 
AAHCFRDGNDHSLWRVNVGDPKSQWGKEFLIEKAVISPGFDVFA 
KJCNQGIL\EFYGD\DIALL\KLAQKVKM\STHCO^PSCXiP\CTM 
\EANLGFLRETFKGSTCR\DHENEL/VWNKQSV\PAHF\VAJb\N 
GSKLEHLTLRMGVE WTS CCRGI»S PKKKTM \FPNLT \,D VRE \ WT 
DNQFL\CS\GPQEDESP\CK*E\SGGA\VFXERHFJ<LSAGGVWC 
SWGL\YNP\CT.GSA\DKNSPKKGPSVAKVPPPTR/DFHIN\I 1 FP 

Q*SPWLRQHPGGMS*XFLPLLANGHLSPFACPARICRPLKFLPS 
EWATLRTL 


5864 


173 


1013 


PJCISVPQSLISLPQPLLCFPGGQEPSAPS PCLYSPLWACS FTMG 
IOiPPSIPPSSPIiACVLKNI*KPLQLTPDI*KPKCIiIFFCNTAWPQY 
KLDNDSK* PENGT FE FS I LQ VLDNS CHKMGKWSE VPDVQAFF \ S 
HWSLPSLCSQC/GLIPNLSSFSPFCSFG/PPPQVPSP/TESFFS 
MDSSDLPPSPQAAPRQAEPGPMSHLASAPPPVNPFITSPPHTWS 
SLQFHSVTSPPPPAQQFTLKKVAGAKGIVKVSAPFSLSQIR*RL 
GSFSSNIKIQPSSWLIWQQP 


5865 


568 


1684 


CLPGPRWGEGWRAGHTI VGCI FFKTAI I SHFKGGM YLCVCMCTC 
LSVCVCVQVGSWICV/CVSMCACVSLCTC\ICRCISMYTREHAC 
ACTRV*VYMC>IS/VCTCVSTCIDVRVCAHVCVYMCLCLGYA*AC 
TCV*MCVO0HEHVCMC/VCACSCVLL/CRGHICM/MCMSAYICI 
/CVY VCVLCVWACMRMSTCVWLVYG *ACTCVWMHM/ CSCTCR / C 
VHVCCMSMHACECLCVYLHI CGCAGTRRWWAGSARGSRS CS Rt, p 
C WAPGPG LS LPGPS CPS VEQG LGGG PGQLQGRS GE ARLGEHRG W 
GSPAAVCSRNCTVS PRRGADC PE APDVP KQPPGWGRAS FEERGC 
GGRGWVCAPPLKTGPQCCTFSIKPELKAKKKK 


5866 


98 


3197 


ARPEVPAP PAWLS RRGAAKMGDKKDDKD4) PKKNKGKERRDLDIJIj 
KKEVAMTEHKMS VEE VCRKYNTDC VQGIiTHSKAQE I LARDG PNA 
LTPPPTTPEWVKFCRQLFGGFSILLWIGAILCFLAYGIQAGTED 
DPSGD^YL^IVLAAVVTITGCFSYYQEAKSSKIMESFKNMVPQ 
QALVIREGE KMQ VNAEE VWGDIjVEI KGGDRVPADLR 1 1 SAHGC 
KVDNSSLTGESEPQTRSPDCTHE\NPLKTRNITFFSNNFVEGTA 
RGVWATGDRTVMGRIATIiASGLEVGKTPIAIElEHFIQLITGV 

AVFIXjVSFFI lsli lgytwleavi fligi I vanvpegliatvtv 

CIiTIiTAKRMARKN CltVKNLEAVE TliGS'f S T I CS DKTGTLTQNRM 
TVAHMWFDNQIHEADTTEDQSGTS FDKSS HTWVALF * H/LIiGFC 
NRPVFKGGQDNI P VLKRD VAGDAS ES ALLKC I EL SSGS VKLMRE 
RNKKVAEIPFNSTNKYQLSIHETEDPNDNRYLLVMKGAPERILD 
RCSTILIiQGKEQPLDEEMKEAFQNAYIiEI.GGLGERVLGFCHYYL 
PEEQFPKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAAVPDAVG 
KCRSAGIKVIMVTGDHPITAKAIAKGVGI IFEGNETVEDIAARIi 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 










WIPVSQVNPaDAKACVIHGTDLKDFTSEQIDEitQNHTEIVFAR" 
?S PQQKLI I VEGCQRQGAI VAVTGDGVNDS PALKKA0 IG VAMG I 
AGS DVS KQAADM I LLDDNFAS I VTGVEEGR L IFDNLKKS IAYTL 
TSNIPEITPFLLFIMANIPLPliGTITILCIDLGTDMVPAlSIiAY 
EAAESD I MKRQPRN P RTDKLVNERLI S MAYGQ IGM IQALGGFFS 
YFVILAENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQWTYEQRK 
WE FTCHTAFFVS I VWQWADI* 1 1 CKTRRNSVFQQGMKNKIL I F 
GLF3ETALAAFLSYCPGMDVALRMYPLKPSWWFCAFPYSFLIFV 
YDEIRKLILRRNPGGWVEKETYY 


5867 


3 


1485 


LPGRRARGGRGLGWPPAQALDGSRMGKAKVPASKRAPSSPVAKP 
G P VKT LTRKKNKKKKRFW KSKARE VS KKP ASGPGAWR P ? KAP E 
DFSQNWKALQEWLLKQKSQAPEKPLVISQMGSKKKPKIIQQNKK 
ETSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTKASGT 
EHNKKGTKERniGDIVPERGDIEHKKRKAK\GQPQPHPPR/lDI 
WFDDVDPADIEAAIGPEAAKIARKQLGQSEGSVSLSLVKEQAFG 
GLTRAIAI^CEMVGVGPKGEESMAARVSIVNQYGKCVYDKYVKP 
TEPVTDYRTAVSGIRPENLKQGEELEWQKEVAEMLKGRILVGH 
ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSLRLLSEK 
I LG LQ VQQAEH CS IQDAQAAMRL YVM VKKEWES MARDRRP LLTA 
PDHCS DDA* QS CPAAAAAPLQRQCDQSQGQ I TS PQS GNS GETFS 
ESWQRGVAWCY 




5868 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTl^AMREDLADIWYIR 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA 
LESRV* T\MTI*DGHNL PS LVCV I TGKGPLREY YSRLIHQKHFQH 
IQVCTP WLEAED YPLLIX3 S ADLGVCLHTS S SGLDLPMKVVDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5869 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQML/CVTNAMREDLADIWYIR 
AVTVYDXPAS FFKETPLDLQHRL FMKLGSMHS PFRARS E PE DPV 
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA 
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQPCHFQH 
IQVCTPWLEAEDYPLLLGSADLGVCriHTSSSGLDLPMKWDMPG 
CCLPVCAVNFKCLHELVKHEENGtiVFEDSEELAAQLQMLFSNFP 
DPAG KLNQFRKNLRES QQLRWDES WVQTVLPLVMDT 


5870 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/ cvtnamredladiwyir " 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERS AFTERDAG SGLVTRLRE RPALLVS S TSKTE DEDFS 1 LLAA 
LESRV* T\MTLDGHNLPSLVCVITGKGPLREY YSRLIHQKHFQH 
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSOLDLPMKWDMFG 
CCLPVCAVNFKCLHELVKHE ENGLVFBDS EBLAAQLQMLFSNFP 
DPAGKLNQFRKMLRESQQLRWDESWVQTVLPLVMDT 


5871 


3 


3465 



FFFCRPLRLYSKTTGDRSAMAGAAGLTAEVSMkVLERRARTKls - 
VLKLL*LSLRRL*LEPTX*NGLLT*CSRLSVFRFLKV\GSVYEP 
LKS INLPRPDNETLWDKLDHYYRI VKS TLLLYQSPTTGLFPTKT 
CGGDQ KAKI QDS L YCAAGAWALAIAYRR I DDDKGRTHELEHSAI 
KCMRGILYCYMRQADKVQQFKQDPRPTTCLHSVFNVHTGDELLS 
YEEYGHLQINAVSLYLLYLVEMISSGLQIIYNTDEVSFIQNLVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 
L * KQFNGFNLFGNQGCSWSVI FVDLDAHNRNRQTLCS LLPRESR 

SHNTTlAR.tiTjPr t TCVD2i13 , ftT TYirnrr po^tt nvrnmvr yp/~(»»~, 

trj\i* nLtUUtii V L*c oy X JjJUa.V VKKLiKGKYGFKR 

FLRDGYRTSLEDPNRCYYKPAEIKLFDGIECEFPIFFLYMMIDG 
VFRGNPKQVQEYQDLLTPVLHHTTEGYPWPKYYYVPADFVEYE 
KNNPGSQKRFPSNCGRDQKLFLWGQALYIIAKIiIADELISPKDI 
D P VQR Y VPLKDQRNVS MRFSNQGPLENDLWHVALI AE SQRXiQ V 
FLNTYGIQTQTPQQVEPIQIWPQQELVKAYLQLGINEKLGLSGR 
PDR P I GCLGTS KI YR ILGKTWC Y PI I FDLS DF YMSQDVFLLID 
DIKNALQ FI KQ YWKMHGR PL FLVL IREDNIRGS RFNP ILDMLAA 
LKKG I IGGVKVHVDRLQTLISGAWEQLDFLRI SDTEELPEFKS 
FEELEPPKHS KVKRQS STPSAP ELGQQPDVNISEWKDKPTHEIL 
aKLNDCSCLASQAIIXGIIXKllEGPNFITKBGTVSDHIERVYRR 



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








AGSQKLWSVVRRAASLLSKVVDSLAPSITNVL\/QGkQVTLGAFG 
KEEEVISNPJjSPRVXQNIIYYKCNTHDEREAVIQQELVIHIGWI 
I SNN PE LFSGTLKI R I GW I IHAME YE LQ I RGGDKPALDL YQLS P 
S E VKQLLLD I LQPQQNGRCWLNRRQ I DGSLNRTPTG F YDR VWQ I 
LERTPNGI 1 VAGKHLPQQ PTLS DMTM YEMN PS LLVEDTLGNIDQ 
PQYRQIVVELIiMWSIVLER^PELEFQDKVDLDRIiVKEAFNEFQ 
KDQSRLKEI EKQDDMTSFYNTPPLGKRGTCS YLTKAVMNLLLEG 
E VKPNNDDP CL I S 


5872 


68 


665 


VQGYMYRFVIKINSCYSEKTS I CRHRCCPELPATQPWPTPTVFF 
N I AIDS ESLG C I \ S F KL FADKV/ P KRWKKNFVLLNTG EXVLGD K 
GPCFYRIIPG\LCQGGDFTHHNGTGGKSLYSKEFDDENFI/IjKH 
TAPGVLSTANAG PTTNGS Q FF I CTAKTEDG * QHWFGKVKDGMS 
XVEALERSGSRNGKTSKKITAANCGQL 


5873 


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSIiALPIiLLSWVAGGFGNAASAR 
HHGLLASARQPGVCHYGTKLACCYGWRRNSKGVCEATCEPGCKF 
GEC VGPNKCRCF PG YTGKTCSQDVNE CGMKPRP CQHRC VNTHGS 
YKCFCLSGHMLMPDATCVNSRTCAMINCQYSCEDTEEGPQCLCP 
S SGBRLAPNGRD CLDIDE CASG KVI C P YNRRC VNTFGS Y YCKCH 
IG PE LQ YI SGR YDCI D INECTMDSHTCSHHANC FNTQGS FKCKC 
KQG YKGNGLRCSA I PENS VKEVLRAPGTIKDRI KKLLAHKNSMK 
KKAKIKNVTPEPTRTPTPKVNLQPFNYEEIVSRGGNSHGG\KKG 
NEEKMKEGLEDEKREEKALKD*HRRERPFRG\DVFFPKVNEAGE 

fglilwqrkaltsklehkadlni s vdcsfkhg \ i cdw\ kqdr\ 
eddfdw\npadr\dnai\gfy\mavpglwqghk\kdigrlklll 

PDLQPQSNFCLLFDYRIAG1>KVGKLRVFVKNSNNALAWEKTTSE 
DEKKKTGKIQLYQGTDATKS 1 1 FEAERGKGKTGEIAVDGVLLVS 
GLCPDSLLSVDD 


5874 


2 


3387 


ACPRLARRRRRVRSL^RRGWLRARWSRGQNKMAARRITQETFD 
AVLQE KAKRYHMDASGEAVS ETTiQFKAQDLLRAVPRS RAEMYD D 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPSISDD 
S YFRKE CGRDLE FSHS NSRDQ VIGHRKLGHFR SQDWKFALRGS W 
EQDFGH P VSQES S WS QE YS FGP S AVLGDFGSSRL I E KECLEKE \ 
SRDYDVDHSG\EA\DSVLRGS\SQVQA\RGRALNIVDQEGSLLG 
. KGETQGLLTAKGG VGKDVTLRNVSTKKI PTVNR I TPKTQGTNQ I 
QKNTPS PDVTLGTNPGTED IQFPXQKI PLGLDLKNLRLPRRKMS 
FDIIDKSDVFSRFGI E 1 1 KWAGFHT 1 KDD I KFSQLFQTLFELET 
ETCAKMLAS FKCSLKPEHRD FCFFT IKFL KHSALKTPRVDIf EFL 
NMLLDKGAVKTKWCFFEIIKPFDKYIMRLQDRLLKSVTPLLMAC 
NAYELSVmKTLSNPLDLALALErTNSLCRKSLALLGQTFSLAS 
SFRQEKIL*AVGLQDIAPSPAAFPNFEDSTLFGREYIDHLKAWL 
VfiSGCPLQVKKAEPEPMREEEKMI PPTKPE IQAKAP SSLSDAVP 
QRADHRWGTIDQLVKRVIEGSLSPKERTLLKEDPAYWFLSDEN 
SLEYKYYKLKLAEMQRMSENLRGADQKPTSADCAVRAr-lIiYSRAV 
RNLKKKLL P \ WQRRGLLRAQG\ LRG \ W KARRA\TTGTQTLLFLR 
APGLKHKGRQAPGLS\QAKPSLPDRND\AAKD\CPLDPV\GFSP 
QDPSLE ASGPSPKPAGVDI SEAPQTSS PCPSADIDMKDNGRTAE 
KLARFVAQVG\PEIEQF\SI\ENSTDNPDLWFL\HDQNSS\AFK 
FY\RKKVFELCPSICFTSSPHNIi\HTGGGDTT\GSQESPVDXiME 
GEAEFEDEPPPREAELES PEVMPEEEDEDDEDGGEEAPA\ PGRG 
GPSLBGSTPADGLPGEA\AEDDL/AliGAPAI>FTGLLQVTCFPFG 
RGFSSKSLKVGMIPAPKRVCLIQEPKVHEPVRIAYDRPRGRPMS 
KKKKPKDLD FAQQKL\ TDK\NLG FQ\MLQKMGWKEGHGLGS LG K 
GIR\SRSACTQQAAWGGSGWGLSPSTCSLPLGSFTAKMAYSWQL 
IFVF 


5875 


295 


1848 


LAALGGL PLWRLSRRG FRE Y LLG LS APSALGGAMRS VS Y VQRVA 
LEFSGSLFPHAICLGDVDNDTLNELWGDTSGKVSVyKNDDSRP 
WLTCSCQGMliTCVGVGDVCNKGKNLLVAVSAEGWFHLFDLTPAK 
VLDASGHHETLIGEEQRPVFKQHIPANTKVMLISDIDGDGCREL 
WG YTDRWRAFRWEELGEGP EHLTGQLVS LKKWMLEGQ VDS LS 
VTLGPLGLP ELM VSQ PGC AYAI LLCTWKKDTGS PPAS EGPTDGS 
/SGDPSCPRRGAAPDIWPYPQQECLHSPNWQHQT\SHGTESSGS 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








GLFAriCTLDGTLKLMEEMEEADKLLWSVQVDHQLFAUSKtiDVTG - 
NGHEEWACAWDGQTYIXDHNRTWRFQVDENIRAFCAGLYACK 
EGRNS PCLVYVTFNQKI YVYWEVQLERMESTNLVKLLETXP \ST 
TACCRSWAWILTTSL*LVPCFTKRSTIQTSHHSVLPQASRIPPS 
KTCLIAGEGFF*TPTLPPKGVFGSHCAAAGSITKQ 


5876 


1122 


224 


HLPLGVP S KVAGAAAME PQEERETQVAAWLKKI FGDHP I ?QYE V 
KPRTTEILHHLSERNRVRDRDVYLVIEDLKQKASEYESEAKYIiQ 
DLLMESVNFSPANLS STGSRYLWALVDSAVALETKDTSLASFI P 
AVNDLTSDLFRTKSKSEE I KI ELEKLE KNLTATLVLEKCLQEDV 
KKAELHLSTER\AKVDNRRQm^\DFLKAKSEEFRFGIQAAGEQL 
SARGQA DAFS VP IQSLVALIRBNWPRLKQQTI PLK\ KKLES YLD 
LMP\NPSHCSK*RIEEAK\REIA\SIEAELTRRVS\MMEIi 


5877 


2030 


1907 


GTLGKMAASSSGEKEKERLGGGLGVAGGNSTRERLLSALEDLEV 
LSRELIEMLAISRNQKLIiQAGEEWQVIjELIilHRDGEFQELMKIiA 
LNQGKI HHEMQVLEKE VEKRDSDIQQLQKQLKEAEQI LATAVYQ 
AKEKLKSIEKARKGAISSEEI I KYAHRISASNAVCAPLTWVPGD 
PRRPYPTDLEMRSGLLGQMNNPSTNGVWGHIjPGDALA/RRKIAR 
CPCSTVS/NGSQMTCR*IN1ILILQKSVCBI. 


5878 


950 


2113 


GLWKCMQLQGPHTHRVQP* PTPRQQGPQ WPVAVIAGNRPNYLY 
RMLRS LLSAQG VSPQMI T VFIDG Y YEEPMDVVALFGLRGI QHTP 
ISIKNARVSQHYKASLTATFNLFPEAKFAWLEEDLDIAVDFFS 
FLSQS I HliLEEDDS IjYC I SAWNDQG YEHTAED PALLYRVETMPG 
LGWVLRRSLYKEELE PKWPTPEKLWDWDMWMRMPEQRRGRECI I 
PDVSRSYHFGTVGl^NMNGYFHEAYFKiaiKFWTVPGVQLRNVDSI, 
KKEAYEVEVHRLLSEAEVLDHSKNPCEDSFLPDTEGHTYVAFIR 
MEKDDDFTTWTQLAKCLHIWDLDVRGNHRGLvrRLFRKKNHFLW 
GVPASPYSVKKPPSVTPIFLEPPPKEEGAPGAPEQT 


5879 


3 


981 


RLTEAAAAGSGSRAAGWAGSPPTIiLPLSPTSpRCAATMASSDED 
GTNGGASEAGEDREAPGKRRRLGFIiATAWLTFYDIAMTAGWIjVL 
AIAMVRFYMEKGTHRGLYKSIQKTLKFFQTFALLEIVHCL1GIV 
PTSVI VTGVQVSSR I FM VWLITHS I KPIQNEESVVLFLVAWTVT 
EITRYS F YTFSLLDHLFYFI KWARYNFFI ILYPVGVAGELLT I Y 
AALPHVKKTGMFSIRLPNKYNVSFDYYYFLLITMASYIPLFPQL 
YFHMLRQRRKVLHG\G*L* KRMIK* SLQTR CFFQNNQD YLS PS F 
NNKNKQLCEISWIVWFLKI 


5880 


1138 


1324 


S LWCLVAGGLGLGPSSQNPLQRAG ilarpreargtfsaltacsa 
svtskgksssgmwpsaasdrdspvplrppgpvqlpsgtgwvlsd 
♦kkkrgrcss/wlsqpqhereke^^ijirsmaegeraraasdvl 
crsianethqlrrtltatahmcqhlakclderqhaqrnvgersp 
dqsehtix3htsvqsviek1qeenrllkqkvthvedlnakwqryn 

ASRDEYVRGLHAQLRGLQIPHEPELKRKEISRIiWRQLEEKINDC 

aevkqelaasrtardaalervqmleqqilaykddfmseradrer 
aqsriqeleekvasllhqvswrqdsrepdagrihagsktakyla 
adalelmvpggwrpgtgscx3peppaegghpgaaqrgqgdlqcph 
clqcfsdeqgeellrhvabccq 


5881 


26 


441 


GGIHPSPTEAPRAQHLTMDCTWRILFIiVAAATGTHAQVQLLQSG " 
SEVKKPGAS VMVS CYVSG YTLTKLSMHWVRQAPGKGLE*MGP FD 
LQD VET I YPQKFQGRVSMTEETS TETTQ / A YLELSS LRSEDTAV 
HHCATDTV 


5882 


2407 


2216 


SGCVEMIiYSHSLEYNPEWISVQSAVAPAQIAUaSDGDL*LHSGE " 
RTRRD*QIiP3AGGPGLQEPLQLGEtiDITSDEFILDEVDG\VDLR 
HYS KQ VELELQQ IEQKS IRD Y I QESEN IAS LHNQ 2 TACDAVLER 
MEQMLGAFQSDLSS ISSE IRTLQEQSGAMHIRLRNRQAVRGKI*G 
ELVDGLWPSALVTAILEAPVTEPRFXEQLQELDAKAAAVREQE 
ARQTAACADVRGVLDRLRVKAVTKIREFILQKIYSFRKPMTNYQ 
I PQTALIiKYRFFYQFLIiGNERATAKElRDEYVETLS KI YLS YYR 
S YLGRLMKVQ YEEVAEKDDLMG VEDTAKKGFFS KPS LRSRNT I F 
TLGTRGS VI S PTELEAP I LVPHTAQRGEQRYPFEAL FRSQH YAL 
LDNS CRE YliF X CE FFVVSGPAAHDLFHAVMGRTLSMTLKHLDS Y 
IJ^CYDAIAVFLCIHIVLRFRNIAAKRDVPALDRYWEQVLAIiLW 



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








PRFEL I L.EMN VQS VRSTDPQRLGGLDTRPHVITRRYAEFSSAL V 
S INQTI PNERTMQLLGQLQVEVENFVLRVAAEFSSRKEQLVPLI 
NNY DMM LG VLM \ E * ERAADDS KEVES FQQLLNARTQEF I EELLS 
PPFGGLVAFVKEAEALI ERGQAERLRGEEARVTQL I RGFGSS WK 
SSVESLSQDVMRSFTNFRNGTSIIQGALTQLIQ\LYHRFHRV\L 
SQPQLRALPARAELINIHHLMVELKKHKPNF 


5883 


2 


1374 


EFPGRRFRAVMEAGAGAGAGAAGWSCPGPGPTVTTLGSYEASEG 
CERKKGQRWGSLERRGMQAMEGEVLLPALYEEEEEEEEEEEEVE 
EEEEQVQKGGSVGSLSVNKHRGLSLTETELEELRAQVLQLVAEIi 
EETRELAGQHEDDSLBU3GLLEDERLASAQQAEVFTKQIQQI.QG 
ELRSLREE I SLLBHEKESELKEI EQELHLAQAE I QSLRQAAEDS 
ATE HES DIAS LQEDLCRMQNELE DMER I RGDYEM E IAS LRAEME 
MKSS EPS GS LGLS D YS GLQEELQELRER YHFLNE E YRA LQES NS 
SLTGQLADLESERTQRATERWU3SQTLSMTSAESQTSEMDFLEP 
DPEMQLLRQQ IjRDAE EQMHGMKKfKCQEL CCELEELQHHRQVS E E 
EQRRLQREDKCAQNEVLRFQTSHS\SPSHPLPPIPPSSPCLL*A 
LWISALLWCWWAETSS 


5884 


4261 


2522 


GVLARAS ARLRVPLTGVRACAEPE VGAE PAKVAGAAE PDEDGGR 
SRLRDCGDYTPSERLGPKGAMIjWFCGAIPAAIATAKRSGAVFVV 
FVAGDDEQSTQMAASWEDDKVTEASSNSFVAIKIDTKSEACLQF 
SQIYPWCVPSSFFIGDSGIPLEVTAGSVSADELVTRIHKVRQM 
HLIiKSETSVANGSQSESSVSTPSASFEPNMTCENSQSRNAELCE 
IPSTSDTKSDTATGGESAGHATSSQEPSGCSDQRPAEDIiMIRVE 
RI.TKKr.EERREEKRKEEEQREIKKEIERRKTGKEMLDYKRKQEE 
ELTKRMLEERKREKAEDRAARERIKQQIALDRAERAARFAKTKE 
EVEAAKAAALLAKQABMEVKRESYARERSTVARIQFRLPDGSSF 
TNQFPSDAPLEEARQFAAQTVGNTYGNFSLATMFPRREFTKEDY 
KKKLLDLELAPSASWL1jP/AI.FINF*AGRPTASIVHSSSGDIW 
TLLGTVLYPFLAIMRL1SNFLFSNPPPTQTSVRVTSSEPPNPAS 
SSKSEKREPVRKRVLEKRG0DFKKEGKIYRLRTQDDGEDENNTW 
NGNSTQQM 


5885 


900 


467 


AAGGGRRSRLSRSWPTGPSKSPSGVRCCG\RR\AWEDKDEFLDV 
XYWFRQIIAWLGVIWGVLPLRGFLGIAGFCLINAGVLYLYFSN 
YLQIDEEEYGGTWELTKEGFMTSFA/IVHGHLDHLLHCHPL*LM 
VY5SQVLPIQSKGPS 


5886 


as 


1341 


PFRGRALTLKKQPRPGVAPPSLGTCHKSDPGRPAAQSQPPSPGS 
GTFGLLSFRMVRTKTWTLKKHFVGYPTNSDFELKTSELPPLKNG 
EVLIjEAIiFLTVDPYMRVAAKRLKEGDTMMGQQVAKVVESKNVAL 
PKGTIVIiAS PGWTTHS I SDGKDLEKLLTEWPDT I PLSLALGTVG 

mpgltaypglleicgvkggetvmvnaaagavgswgqiaklkgc 
kwgavgsdekvaylqkw5fdwfnyktveslbetlkkaspdgy 
dcyfdnvggefsntvigqmkkfgriaicgaistynrtgflppgp 

PPEIGIYQELRMEAFWYRWQGDARQKAIjKDliliKWVLELPYFVI 
D*LQANTLVYKSMKSAKPSLEYISEKLVSG\KIQYKEYIIEGFE 
NMPAAFMGMLKGDNLGKT I VKA 


5887 


1937 


104 


APGCRGCRATRCPCRGPRWDSIiGDEAARSPAAPGGAPGLLGLRE 
R PDRCHPGGDDRGPQLHRGSPG/S PS ELS RRPG P PGL PGLQGP P 
PAPGLPQSRTL/PVLCVCDLSPAQCDIITCCCDPDCSSVDFSVFS 
ACS VPWTGDSQFCSQKAVI YSIihFFTANPPQRVFELVDQINP S I 
FCIHITN\*Nt*HYPIiIjIQKYl*/NENNFDTLMKTSDGFTLWAESY 
VSFTTKLDI PTAAKYEYGVPLQTSDSFIiRF PSSLTS SLCTDNNP 
AAF LVNQAVKCTRKI NJbEQCE E I E ALSMAFYSSPB I LRVPDSRK 
KVPITVQS I VI QSLNKTLTRREDTDVLQPTLVNAGHFSLCVNW 
LE VKYS LT YTDAGE VTKADLS FVLGTVSS VWPLQQKFE IHFLQ 
ENTQPVPLSGNPGYWGLPLAAGFQPHKGSGIIQTTNRYGQLTI 
LHSTTEQDCLALEGVRrPVLFGYTMQSGCKIiRLTGALPCQLVAQ 
KVKSLLWGQGFPDYVAPFGNSQGP/ADMLDWVPIHFITQSFNRX 
DSCQLPGALVIEVKVITKYGSLLNPQAKIVKVTANLISSSFPEAN 
SGNERTILISTAVTFVDVSAPAEAGFRAPPAXNARLPFNFFFPF 
V 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


5888 


375 


2302 


LLCRTPGVAMQRADSEQPSKRPRCDDSPRTPSNTPSAEADWSPG 
LELHPDYKTWGPEQVCSFLRRGGFEEPVLLKNIRENEITGALLP 
CLDESRPENLG VS S LGERKKLLS Y I QRLVQ I HVDTMKV I NDP I H 
GHIELHPLLVRIIDTPQFQRLRYIKQLGGGYYVFPGASHNRFEH 

PFSHMFDGRFIPLARPEVKWTKEC3GSVMMFEHLINSNG I KPVME 
QYGLIPEEDICFIKEQIVGPLESPVEDSLWPYKGRPENKSFLYE 
I VSNKRNG I DVD KWD YFARDCHHLG I QNNFDYKRF I KFARVCE V 
DNELR I CARDKE VGNLYDMFHTRNSLHRRAYQHKVGNI I DTMIT 
DAFLKADDY IE I TGAGGKKYRIS TAIDDMEAYTKLTDNI FLEIL 
YSTDPKLKDAREILKQIEYRNLFKYVGETQPTGQIKIKREDYES 
LPKEVASAKPKVLIiDVKIjKAEDFlVDVIKMDYGMQBKNPIDHVS 
FYCKTAPNRAIRITKNQVSQLLP\EKFAEQ\LIRVYCKKVDRKS 
LYA\ARQYFVQW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\ 
NDSTFSPKIPTRLPRRLPKSRV\QLFXDDPM 


5889 


1831 


731 


LPAACGRPVTARPRQAPEGRSGRPRDI/DPYPPQVFPPRPDRVAI 
VTGGTDGIGYSTAKHLARLGMHVIIAGNNDSKAKQWSKIKEET 
LNDKET*VLLCCPGWLCIiWNSSDPPTSASRGAGTTGVHHHFIjLK 
FGIFIt»\DLASMTSIRQFVQKFKMKXI?I*HVLINNAGVMWVPQR 
KTRDGFEEHFGLNYLGHFLLTNLLLDTLKESGSPGHSARWTVS 
SATHYVAE LNMDDLQS S ACYS PHAAYAQSKLALVLFTYHLQRt»I> 
AAEGSHVTANWDPGVWTDLYKHVFWATRLAKKLLGWLLFKTP 
DEGAWTSIYAAVTPELEGVGGRYLYNKKETKSLHVTYNQKLQQQ 
LWSKSCEMTGVLDVTL 


5890 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS 
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP 
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV 
ILEKEGPRSLFRGLGPNIiVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL* /SQGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYES1 
KQKLLEYKTASTMENDEESVKEASDFVGMMIiAAATSKVLVATTr 
AYP HE WR TRLREEGTKYRS F FQTLS LLVQEEG YGSLYRGLTTH 
LVRQ I P \NTAIMMAT YELWYLLNG 


5891 


1322 


200 


FRRGWSAAGRAVP VAFCSR I SASSPRRPRGAVRLQSGTEAACRS 
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAIT,TCP 
LEWKTRI^SSSVlXYISEVQIJriWAGASVNRVVSPGPLHCLKV 
ILEKEGPRSIiFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDSTQVHMI S AAMAGFTAI TATNPIWLI KTRLQL * / SQGTAGKR 
RMGAFECVRKVYQTDGIiKGFYRGMSASYAGISETVIHFVIYESI 
KQKLLB YKTASTMENDE ES VKEAS DF VGMM LAAATS K\ L VATT I 
AYPHBVVRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH 
L VRQ IP \NTAI MMAT YE LWYLLNG 


5892 


1764 


379 


W LR VCGRLS VNSAVSS RTGG W SAGLTCAMQRLQVVLGHLRGPA 
DSGWKPQAAPCLSGAPHASAADWWHGRRTAICRAGRGGFKDT 
TPDEXjLSAVMTAVLBQ^VNLRPEQLGDICVGNVLQPGAGAIMARI 
AQFLSD I PET VPLSTVNRQ CS SGIiQAVAS IAGGIRNGS YDIGMA 
CG VESMSLADRGN PGNI TSRLMEKEKARDCL I PMG I TS ENVAER 
FGISREKQDTFALASQQKAARAQSKGCFQAEtVPVTTTVHDDKG 
TKRS ITVTQDEGIR P STTMEGLAJKLKPAFKKDGSTTAGNS SQVS 
DGAAAILLiARRS KAEELGLP I LGVLRS YAWGVP PD IMG I GPAY 

-r\ -r T5T fA T nV)\ /IT <n7CT>T77~> T CCTTsTT? \ TV PR C/"\7i B V^ 1 ! rPVT DT 

Al i> VAJjQ aAGJj 1 v&v vu 1 c a xN t, \Ar AbyAAi w Vii J^KJuFP * SG 
* TPLGGAS GP * GHPLGLHWGHVQ VITLAQ * S * SARGKRAYRSGC 
PCAIGSWNGSPLPVFEYPWGT 


5893 


3 


1653 


ILS KRRCQKAKTKELMAKKVAVIGAGVSGL I SLKCCVDEGLEP T 
CFERTEDIGGVWRFKENVEDGRAS I YQSVVTNTSKEMSCFSDFP 
MPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFQTTVLSVRKCP 
DFSS SGQ WKVVTQSNGKEQS AVFDAVMVCSGHHI LPHI PLKS FP 
GMERFKGQYFHSRQYKHPDGFEGKRILVIGMGNLGSDIAVELSK 
NAAQVFISTRHGTWVMS R I SEDGYPWDS VFHTRFRSMLRNVLPR 
TAVKWMIEQQMNRWFNHENYGLBPQNKYI MKEP VLNDDVP SRLL 
CGAI KVKST VKE LTETSAI FEDGTVEENIDVI I FATGYSFSFPF | 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LEDS LVKVENNMVSLYKY I FPAHLDKS TLAC IGL I QP LGS I F P T 
AE LQARW VTRVFKGLCSLPSERTMMMDI IKRNEKRI DLFGESQS 
QTI^TNYVDZLDELALEXGAKPDFCSLLFKDPKLAVRLYFGPCN 
S Y* YRLVGPGQWEGARNAI FTQKQRI LKPLKTRALKDS SNFSVS 
FLLKILGLLAVWAFF\ CQLQWS 


5894 


174 


1673 


RYS PKKVLQNKES SIjKLGMATAIiVS AHS LAPLNLKKEGIiRWRE 
DHYSTWEQGFKLQGNSKGLGQEPLCKQFRQIiRYEETTGPREALS 
RLRELCQQWLQPBTHTKEHILELLVLEQFL 1 1 L PKELQARVQEH 
HPESREDVWVLEDLQLDLGETGQQVDPDQPKKQKILVEEMAPL 
KGVQEQQVRHECEVTKPEKEKGEETRIENGKLIWTDSCGRVES 
SGKI SEPMEAHNEGSNLERHQAKPKEKI EYKCSEREQR F IQHLD 
LIEHASTHTGKKLCESDVCX3SSSLTGHKKVLS*ERKVIQC\HGV 
LGKAFQRSSHLVRHQKIHIiGEKPYQCNECGKVFSQNAGLLEHLR 
IHTGEKP YLC IHCGKNFRRSSHl^NRHQR J HSQEEPCECKEOGKT 
FSQALLLTHHQRIHSHSKSHQCNECGKAFSLTSDLIRHHRIHTG 
E KPFKCNI CQKAFRLNSHLAQHVRI HNEEKPYQCSECGEAFRQR 
SGLFQHQRYHHKDKLA 


5895 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGB 
KRLFVSIX5\TGCLPVLAAAGRARGRAEVI*ISCVGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW 
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ \NCPFLAGETESLAD I VLWGALYPLLQDPA YLPE ELS ALHS W 
FQTLSTQ\EPCQR\AARRLVLKQ\QGVIjALR\PYL(QKQPQPSEA 
EGKGLSPIE PE3EELATLSEEE I AMAVTAWEKGLESLP PLRPQQ 
NPVLPVAGERNVLITSALPY VNNVPHLGN I IGCVLSADVFARYS 
RLRQtWTLYLCGTDE YGTATETKAL\EEGLTPQE ICDKYHI IHA 
DIY\RWFNISFDIFGRTTTPQQ\TKrT\QDIFQQOLKRGFVLQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCI>KCGKLI 

PGSDWTPNAQFITFFFGFREWPSKPRWG*TRDLK\WGNPGTP*E 
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 
FM \ AKDNVPFKS LVFPS SALGAEDN YTL \ VSHU ATE YLN YEDG 
K\ FS KS RGVG VFRDM \AHDTG IP PD I S RFYL\L YI RPEG K\DS A 
FS WTDLLLKNNS \ ELLNNLGNF INRA\GM F VSKFFGG \ Y VPEMV 
LTPDDQRLLA\HVTLELQIIYIIQ\LLEKVRIRDALRSILTIS\RH 
GNQYI \ QVNE PW\ KR I KGSEADRQRAGTVTGLAVNI AALLSVML 
QP YMPTVSAT IQAQLQLPP PACS ILLTNFLCTLPAGHQIGTVSP 
LFQKLENDQI ESLRQRFGGGQAKTSPKPAWETVTTAKPQQ IQA 
LMDEVTKQGNIVRELKAQKADKNBVAASVAKLLDIjKKQLAVAEG 
KPPEAPKGKKKK 


5896 




86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSDGVPGCLPVIAAAGRARGRAEVLISTVGPEDCVVPFLT 
RPKVPVIiQIiDSGNYLFSTSAICRYFFXLLSGWEQDDLTNQWIiEW 
EATE1»QPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ\NCPFIiAGE TES LAD I VLWGALYPLLQDPAYLPEELS ALHS W 
FQTLS TQ\ E PCQR \ AARRL VLKQ\QGVLALR\ P YLQ KQPQPS PA 
B3KGLS P IEPE EEELATLS EEEI AMAVTAWEKGLES LPPLRPQO. 
KPVLPVAGERNVLITSALPYVNNVPHLGNIIGGVLSADVFARYS 
RLRQWNTL YLCGTDE YGTATET KAL \ EEGLTPQ E I CDKYH I IHA 
DIY\RW FNISFDI FGRTTTPQQ\TKI T\QDIFQQIiLKRGFVLQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI 
NAVEL KKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL 
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 
GFEDK\VFYVWFDATIGYLS I TANYTDQWERWW\ KNPEQVDLYQ 
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV 
LTPDDQRLLA\HVT LELQHYHQ \LL E KVRI RDALRS I LT I S \ RH 
GNQY I \ QVNEP W\ KRI KGS EADRQRAGTVTGLAVN I AALLSVML 
Q P YM PTVSATI QAQLQLPP PACS ILLTUFLCTL PAGHQI GTVS P 
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LMDE VT KQGNX VRE L KAQKAD KNE VAAE VAKLLDLKKQL AVAE G 
KPPEAPKGKKKK 


5897 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LI£GWEQDDLTNQWLEW 
BATELQ PTLS AAL YYL\ WQGKKG \ EDVLGS VRRTLTH I DHS LS 

rqVncpflagetesladivlwgalypllqdpaylpeelsalhsw 

FQTnSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA 
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEX0LBSLPPCJRPQQ 
NPVLPVAGERNVL ITSALPYVWNVPHLGNI IGCVLSADVFAR YS 
RLRQWICTLYLCGTDE YGTATETKAL\EEGLTPQEI CDKYHI IIIA 
D I Y \ RWFNI S FD I FGRTTTPQQ \ TKI T \ QD I FQQLLKRG FVLQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI 
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTlj 
PGSDWTPNAQFITPFFGFREW PS KPRWQ *TRDLK\WGNPGTP * E " 
GFEDK\ VFYVWFDATIGYLS I TANYTDQWERWW\KNPEQVDL YQ 
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 
K\ FSKSRGVG VFRDM \AHDTG I PPD I S R FYL \ LYIR PBGK\DSA 
FS WTDLLLKNNS \ ELLNNLGNF I NRA\GMFVS KFPGG \ YVPEMV 
LTPDDQRLLiA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH 
GNQYI \QVNEPW\ KRI KGSEADRQRAGTVTGLAVNIAALLSVML 
Q P YMPTVSATI QAQLQLPP PACS I LLTNFLCTLPAGHQIGT VSP 
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA 
LMD3VTKQGN.TVRELKAQKADKNEVAAEVAKTJ..DLKKQLAVAEG 
KPPEAPKGKKKK 


5898 


2967 


86 


HPS^LGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSDGVPGCLPVIJW^RARGRAEVLISWGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDI.TWQWLEW 
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEEL$AI*HSW 
FQTIiSTQ \ E PCQR\AARRLVLKQ\ QG VLALR\ PYLQ KQ PQ PSPA 
EGKGLSP I EPEEEELATLSEE E I AMA VTAWEKGLESLPPLRPQQ 

npvlpvagernvlitsalpyvnnvphlgni igcvlsadvfarys 
rlrqwntlylcgtdeygtatetkal\eegltpqeicdkyhiiha 
diyVrwfnisfdifgrtttpqqNtkitXqdifqqllkrgfviiqd 

TVEQLR CEHCARF \ IiADRFVEGVCP FCG YEE ARGDQ CDKCGKL 1 
NAVELKKPQCKVCRSCPWQSSQttliFLDLPKLEKRLEEWLGRTL 
PGSDWTPNAQF1TPFFGFREWPSKPRWQ*TRDLK\WGWPGTP*E 
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 
FM \ AKDNVPFHSI»VF PS SAI/3AEDN YTL\ VS HL IATE YLN YEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FSWTDLLLKNNS\ ELLNNLGNF I NRA\GMFVS KFFGG \ YVPEMV 
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH 
GNQYI \QVNE P W\KR IKGSEADRQRAGTVTGLAVN IAALLSVML 
QP YMPTVSATI QAQ LQLP PPACS I LLTNFLCTLPAGKQ I GTVS P 
LFQKLENDQIESLRQRFGGGQAKTS pkpawetvttakpqqiqa 
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5899 


326 


1078 


NCPICSKEPNGVRAPSLPSPLRAAMALSDVDVKKQ1KHMMAFIEQ 
EANEKAEEIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 
QQKKI LMS TMRNQARL KVLRARNDL I S DLLS EAKLRLSR IVEDP 
EVYQGLLDKLVLQGLLRLLEPVMIVRCRP\QDLLLVEAAVQKAI 
PBYMTISQKHVEV\QIDKEA*LAVECSWEVWEVYSGKQRIKVSK 
TLESRLDLSAKQKMPE IRMALFGANTNRKFFI 


5900 


64 


1409 


KAASRDS P CLE FCPLCG VS S HDLQHRM WYHRLSHLHSRLQDLLK 
GG V I Y PAL PQPNFKSLL PLAVHWHHTAS KSLTCAWQQHE DHFEL 
KYANTVMR FD YVWLRDHCRSAS C YNS KTHQRSLDTAS VDLC I KP 
KTIRLDETTLFFTWPDGHVTKYDLNHLVKNSYEGQKQKVXQPRI 
LWNAE I YQQAQVPS VDCQS FLETNEGLKKFLQNFLLYG I AFVEN 
VPPTQEHTEKLAERISLIRETIYGRMWYFTSDFSRGDTAYTKLA 
LDRHTDTT YFQEPCGI QVFHCLKHEGTGGRTLLVDGF YAAEQVL 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 

") q c.ri h i Crn 

corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide" 
(A^Alanine, C=*Cysteine, D*=Aspartic Acid, E^ 
Glutamic Acid, F=Phenylalanine, G^Glycine, 
H-Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=sMe.thionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S*=Serine, T»= Threonine, V==Valine, 
W=Tryptophan, Y-Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








QKAPEEFELLSKSAI \KHEYIEDVGECHQPHDWDWAQS* t STHG 
/ YKE LY L I RYNN YDRAV INTVP YDWHRW YTAHRTLT I ELRRP E 
NEFW VKL KPGRVLF IDNWR VLHGRECFTG YRQLCGCYLTRD DVL 
NTARLLGLQA 


5901 




2121 


VAIEQTSLKMMQAVGGAPARPTGEYICKQCGAKYTSLDSFQTHIj 
KTHLDTVLPKLTCPQCNKEFPNQESLLKHVTIHFMITSTYYICE 
SCDKQFTSVDDLQKHLLDMHTFVFFRCTLCQEVFDSKVSIQIjHL 
\AVKHSNEKKVYRCTSCNWDFRNETDI^LHVKHNHLENftGKVHK 
CI FOGES FGTEVELQCHI TTHSKKYNCKFCS KAFHAI I LLEKHL 
REKHCVFETKTPNCGTNGASEGVQKEEVELQTLLTNSQESHNSH 
DGSEEDVDTSEPMYGCDICGAAYTMETLLQNHQLRDHMIRPGES 
AIVKKKAELIKGNYKCNVCSRTFFSENGLREHMQTHLGPVKKYM 
CP I CGERFPSLLTLTEHJCVTHSKSLDTGNCR I CKMPLQSEEE Fh 
EHCQMKPDLRNSLTGFSCWCMQTVTSTLELKIHGTFHMQKTGN 
GSAVQTTGRGQHVQKIjYKCASCLKEFRSKQDLVKLDIMGtiPYGL 
CAGCVKIiSKSASPG INVPPGTNRPGLGQNEHLSA I EGKGKVGGL 
KTR CS * LATFKF*VLKVELPE PHPKPFHRGVS RPDSNSTQLKTP 
QVS PMPR I S PSQSDEKKT YQC I KCQMVF YNEWDIQ VH VANTHK I D 
EGLN HECKLCSQTFDS PAKLQCHL I EHS FEGMGGTFKCP VCFT V 
FVQANKLQQH I FS AHG QED K I YDCTQCPQKFF FQTE LQNHTMTQ 
KSS 


5902 


712 


209 


IiKNRRRSRPSIRQSIGSTSVSRWLTSLFTYIiDHTADVQ*V*REF 
I PLXPRQ* ED *MFQS WliHAWGDTLEEAFEQCAMAMFGYMTDTGT 
VEPLQTVEVETQGDDLQSLLFHFIjDEWLYKFSADEFFIPVGWGE 
EFSLSKHPQGTEVKAI TYSAMQVYNEENPEVFVI ID I 


5903 


2106 


735 


DTPGPSLPSTTAPFSLRSLSFPSRPSYLLPGDPQPLQGRGLPTT
P ALFALSAVPGGAAS PMPPSGLRLL P LI1LPI1LWLL VLTPGR PAA 
GLSTCKTIDMELVKRKRIEAIRGQIItSKLRLASPPSQGEVPPGP 
L P EAVtiALYNS TRDRVAGESAEPEPE PEADYYAKEVTRVXiM VET 
HNE I YDKFKQS THS I YM FFNTS ELRE AVPE P VLLSRAEIiRLLRL 
KLKVEQHVELYQKYSNNSWRYLSNRLLAPSDSPEWLSFDVTGW 
RQWLSRGGEIEGFRLSAHCSCDSRDNTLQVDINGFTTGR\RGDL 
ATIHGMNRPFLt*LMATPIiERAQHLQS\SRHRQAL\DTNY\CFSF 
HGGRNCLRC/ VHC*HL X FRKDL \GW \ KM I \HE \ P KG YHANFC\L 
GPCPYI WS LDTQYSKVliALYNQ\HKPG\ASAAP\CCVPQALEP \ 
IiPIVYY\VGRKPKVEQIjSNMIVRSCKCS 


5904 


3 


1126 


MMEEIENAIWTFKEEQRLIYEEIilKEEKTTNNELSAISRKIDTW 
AIjGNSETEKAFRAISSKVPVDKVTPSTLPEEVLDFEKFLQQTGG 
RQGAWDDYDHQNFVKVRWKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFIALEERKKES IQIWKTKKQQKREE IFKLKEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKLAVEAWKKQKS I EMSMKCASQIi 
KEEEBKEKKHQKERQRQFKLKLLLESYTQQKKEQEEFLRLEKEI 
REKAEKAEKRKNAADE I SRFQERDLHKLEIiKI LDRQAKEDEKSQ 
KQRRIiAKLKEKVENNVSRDPSRIiY/NTHQRLGRTNQKDRTNRLW 
ATS T YP T* G YSWL ETRNTE KSMR 


5905 


287 


2912 


MAS FPPRVNEKE I VRLRTIGELLAPAAPFDKKCGRENWTVAFAP 
DGS YFAWSQGHRTVKLVPWSQCXQNFLriHGTKNVTNSSS LRLPR 
QNS DGG QKNKP REHI IDCGD I VWS IA FGSS VP EKQSRCVNI EWH 
RFRFGQDQLL tATGJLiNSGR I KI WDVYTGKLLLNLVDHTG VVRDL 
TFAPDGS LI LVSAS RDKTIiRVWDLRDDGN\MMKVLRGHQN WVY\ 
SCAFS PDSSMLCSVGASKAWAAILV* LRLCWHHSHTGATM VLS 
WAERVAS LATGLGATFTI G*SNLAFVLQGVLYVHRCWSMSTFCF 
SFFLFFFFKVISPTVKYH*LLSKLIFQFYGIGSLTSETNLM*SI 
WLSNG FS VL FFG ILSDSRDI LRL* FNLKF VL I FF * K* CIVS VQ K 
KKKP KR I ALLQEERLS *DKP PS SHLI * QTEVNI RX LFRAI LHS * 
LLI FRI* NCI * T YS * I IDPFYI QMTYDRG*FGKNKMVKF* FIEM 
* LYYFHXIAFS FCNW*HPCCLPKKFHLAVNILFACS ICFS S * A 
QVGDPSLL*TSDYLKGRCQWSNNLLTLRFLSVYFFKNLVVSGKK 
REGGL * YLTLF ISVYFS * LVFG1NGFQYS F WKLHCLYFMFRLI 
FFCLTFNRNI*NRICMSALINLKTDFNLTMTLSIFFKLLIIYNA* 
YNLN * I * QF * YKMCHFVI.CMS E * S YNI CLFIAGF \ LWNMDXYTM 
IRKLEGHHHDWACDFS PDGALLATAS YDTRVYIWDPHNGDILM 
EFGHLFPPPTPIFAGGANDRWVRSVSFSHDGLHVASLADDKMVR 
FWIDEDYPVQVAPLSNGLCCAFSTDGSVIiAAGTHDGSVYFWAT 
PRQVPSIiQHIiCRMSIRRVMPTQEVQELPIPSKLLEFLSYRI 


5906 


146 


2038 


REGAGSGRMASGA\YNPYIEIIEQPRQRGMRFRYKCECiR5AGSI 
PGEHSTDNNRTYPS I QI MNYYGKGKV\RITLVTK\NDP YKPHPH 
DLVGKDCRD\G YYEAEFGQE\ RRP \LFFQN \LGIRCVKKKEVKE 
A\IITR\IKAGINPFDVP*KQLNDIEDCDIiDWRLWFRVFLPDG 
HGNli \ TTAJbP?V\VSS PI YDNRAPNTAELRVCRVNKNCGS VRGG 
DE IFLLCDKVQKDDI EVRFVLNDWEAKGI FSQADVHRQVAI VFK 
TP P YCKAI TEP VTVKMQLRRPSDQE VS ES MDFR YIiPDE KDT YGN 
KAKKQKTTLLFQECLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
QSAG I TVNFPERPRPGLLGS IGEGRYFKKEPNLFSHDAWREMP 
TGVSSQAESYYPSPGPISSGLSHH^SMAPLPSSSWSSVAHPTPR 
SGNTNPLS S FSTRTL PSNSQG I PPFLRI PVGNDLNASNAC I YNN 
ADDIVGMEASSMPSADLYGISDPKM^SNCSVNMMTTSSDSMGET 
DNPRLLSMNLENPSCNSVLDPRDLRQLHQMSSSSMSAGANSNTT 
VFVSQSDAFEGSDFSCADNSMINESGPSNSTNPNSHVFVQDSQY 
SGIGSMQNEQLSDSFPYEFFQfV 


5907 


99 


1373 


TYLLSS WSS * * WLDTK I KSQVKV/ KKGHKJfc i S WP YPQPAKQNGK 
KATSKVPSAPHFVHPNDHANREAELKKKWVEEMREKQQAAREQE 
RQKRRTIESYCQDVLRRQEEFEHKEEVLQELNMFPQLDDEATRK 
AYYKEFRKWEYSDVILEVIiDARDPLGCRGPQMEEAVLRAQGNK 
KLVLVIiNKIDLVPKEWEKWLDYLRNE LPTVAFKASTQHQVKNIj 
NRCSVPVDQASESLLKSKACFGAENLMRVLGNYCRLGEVRTHIR 
VGWGLPNVGKSSIilNSLKRSRACSVGAVPGITKFMQEVYLDKF 
IRLIjDAPG IVPGPNS EVGTI LRN CVH VQKLADP VTP VET I LQRC 
NLEE I SNYYGVS GFQTTEH FLTAVAHRIiGKKKKGGIj YSQEQAAK 
AVIADWVSGKI SFYIPP PATHTLPTHLSAE 1 VKEMTEVFD I EDT 
EQANEDTMECLATGESDELLGDTDPI.EMEI KLLHSPMTKIADAI 
ENKTTVYK IGDLTG YCTNPNRHQMGWAKRNVDHR PKS NS MVDVC 
S VDRRS VLQR I METDPLQQGQALASAL KNKXKMQKRADKI AS KL 
SDSMMSALDLSGNADDGVGD 


5908 


247 


975 


HCG I KKRGEGSG S P S P ASG G FQLG CQ I P 3 PSLPS EEETH PHTRA 
HTRTLRATLTRRPPRSHSTRLRFPMPliDGEXfGLASWK/PMRER* 
GWRR PAKAAGAS LGVAATG KRGCRMS fCRYLQKATKGKLL 1 1 1 FI 
VTLWGKWSSANHHKAHHVKTGTCEWALHRCCNKNKIEERS QT 
VKCSCFPGQVAGTTRAAPSCVDASIVEQKWWCHMQPCLEGEECK 
VLPDRKGWSCSSGNKVKTTRVTH 


5909


1 


5002 


PAI PGSTI I WAPGSHSAARADGRHGSLPS C2SQAPGALCGARAPI> 
S SNLRADRSMI CAQARAGKNL YHNRFLGLAAMAF PSRNS QS LRR 
CKEPIRYSYNPDQFHNMDLRGGPHDGVTIPRSTSDTDLVTSDSR 
STLMGRSSYYSIGHSQDLVIHWDIKEEVDAGDWIGMYLIDEVLS 
ENFLD YKNRG VNGS H RGQ 1 1 WKI DASS YFVEPETKI CFKYYHGV 
SGALRATTPS VTVKNSAAPI FKS IGADETVQGQGSRRIiI SFSLS 
DFQAMGLKKGMFFNPDPYLKIS IQPGKHS I FPALPHHGQERRSK 
I IGNTVNP IWQAEQ FSFVSLPTDVLE IEVKDKFAKSRP I IKRFL 
GKLSMPVQRLLERHAIGDRWSYTLGRRLPTDHVSGQLQFRPEI 
TS S IHPDDE E I SLS TE P ES AQ I QDS PMNNLME S GSGE P RSEAPE 
S S ES WKP EQLGEGS VPDRPGNQS I ELS R P AEEAAV I TEAGDQGM 
VS VGPE GAGELLAQVQ KDI QPAP SAEELAEQLDU3E EAS ALLLE 
DGSAPASTKEEPLEEEATTQSRAGREEEEKEQEEEGDVSTLEQG 
EGREiQLRAS VKR KS R PCS LP VS ELETVI AS ACGDPETPRTHY I R 
2HTLLHSMPSAQGGSAAEEEDGAEEESTLKDSSEKDGLSEVDTV 
AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
SSC YSAS CYSPSCYNGNRFASHTRFSSVDS AKI SESTVFSSQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDELAAPSGHVER 
SPEGLESPVAGPSNRREGECPILHNSQPVSQLPSLRPEHHHYPT 
IDE PLPPN WEARZDS HGRVF YVDHVNRTTTWQRPTAAATPDGMR 
RSGS I OQMEQLNRRYQN I QRT IATBRSEEDSGSQS CEQAPAGGG 
GGGGSD3EAESSQSSLDLRREGSLS PVNSQKITLLLQS PAVKF1 
TKPEFFTVLHANYSAyRVFTSSTCbjaiMILICVRRDARNFERYQH 
NRDLVNFINMFADTRLELPRGWEIKTDQQGKSFFVDHNSRATTF 
IDPRI PLQNGRLPNHLTHRQHLQRLRS YSAGEASE VSRNRGASL 

RQPSLARNHTLREKIHYIRTEGNHGLEKLSCDADLVILLSLFEE 
EIMSYVPLQAAFHPGYSFSPRCSPCSSPQNSPGLQRASARAPSP 
YRRDFEAKLRNF YRKLEAKGFGQGPGKIKLI IRRDHLLEGTFNQ 
VMAYSRKELQRNKLYVTFVGEEGLDYSGPSREFFFLIiSQELFNP 
YYGLFEYSANDTYTVQISPMSAFVENHLEWFRFSGRILG\liALI 
HQYLLDAFFT\RPFYKALL\RLPC\D\LSDLEYU)EEFHQSLQW 
MKDNNI TD I LDLTFTVNBE VFGQVTEREIjKSGGANTQVTE KNKK 
EYIERMVKWRVERGWCQTEALVRGFYEWDSRLVSVPDAREIxE 
LV I AGTAE IDLNDWRNNTE YRGG YHDGHLVI RWFWAAVERFNNE 
QRLRLLQFVTGTSSVPYEGFAAPPWEPMGLRRFLP+KKWGKITS 
LPPRG\HTCLQPDWDLPTVSPRTPMLYEK\LLTA\VEETSTFGT 


5910 


1526 


446 


VAEFAAMEPGRTQI KLDPRYTADLLE VLKTKYGI P S ACFSQPPT 
AAGLLRALGPVEIiAIjTSIIjTIiIiATjG*! IJVT PT.PTl2xWT 1 VlfM''rT.PD 

IKRRTLLWKSSAPTWSVLCCFGLWIPRSIjVLVEMTirSFYAVC 
FYLLMLVMVEGFGGKEAVLRTLRDTPMMVHTGPCCCCCPCCPRL 
LLTRKKLQ \R*CWAIjSNTPS * R* R* PWWACFSS PTASMTQQTFL 
RGAQLYGSTLSSA/CSTLLALWTLGI ISRQARLHLGEQtJMGAKF 
ALFGVLL I LTALOPS I FSVIiANOfiO T AfQ ppv<! <?tttp 

LLILETFLMTVLTRT4YYRRKDHKVGYETFSSPDLDLN1jKALRWM 
AWTMKGCCTH 


5911 


109 


595 


Q IjPIjAPC I QGKGLEMRS P KPQSFI I RS SHSGAGI»Ia VKN PSTP VF ' " 
CGHRRGGAAFKYKPTPWGPEQRPTGQKHMRGGVS LLS PRLECS 
GTISAHCNLRLPSSSNS PAPAS * LAGITGVCHHAQL1 FVFLVET 
GFHHVGQAGLELL/NWIHLPRPPKVLGLQA 


5912


924 


277 


MILNKALMLGALALTTVMSPCGGEDIVADHVASYGVNLYQSYGP 
SGQYSHEFDGDEEPYVDLERKETVWQLPI»FRRFRRFDPQFAtTN 

TAUT.VHWTiMTUTTfDQMCiTa3iTMB'iroi7\7nnn?eireTiuirT r>nmTmT t 
inv ijxuiHiiiii j, vxxvit&«o inrti JMJSvrJCtV 1 VroKSPVTJbGQPNTIjI 

CL VDN I FP PWNl TWLS NGHS VTEGVS ETRPS S FKS DHFI iLQDQ 

VTSPSFPFE* * DL * TAKVEQLGAWFE PLIjKHWGAE I P TTI* 


5913 


46 


1198 


QLRMAGAEGAAGRQSEbEPWSLVDVLEEDEELENEACAVLGGS 
DSEKCSYSQGSVKRQALYACSTCTPEGBEPAGICLACSYECHGS 
HKLFELYTKRNFRCDCGNSKFKNLECKIiLPDECAKVNSGNKYNDN 
FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAIPPE 
SGDFOEyVCOACMKRCSFriWAYAAntiAVTICT QT\fiMivmwrr!TT wi 

E * /DDQEVI KPENGEHQDS TLKEDVPEQGKDDVREVKVEQNSBP 
CAGSSSESDLQTVFKNESLNAESKSGCKLQELKAKQLIKKDTAT 
YWPIiNWRSKLCTC^DCMK^GDLDVLFLTDEYDTVLAYENKGKI 
AQATDRSDPLMDTLSSMNRVQQVEL1C/GIQ*FED 


5914 


960 


124 


NLGGS ELPPEEALFIQVASMNQRRVDFYIAS IEDMLVAI /GGRN 
ENGALSSVETYSPKTDSWSYVAGLPRFTYGHAGTIYKDFVYISG 
GHDYQIGPYRKNLLCYDHRTDWEBRRPMTTARGWHSMCSLGDS 
I YS IGGSDDNI ESMERFDVTjGVEAYS PQCNQWTRVAPLLHANSE 
SGVAVWEGR I Y I LGGYS WENTAFS KTVQ V YDREADKWSRGVDL P 
KAIAGGSACFIAP*SLGQRTRKRKAKARGTRTGASDPSCASWDH 
PHRHIi PGLCRPAATS 


5915 


1604 


703 


FPGRPTRPLKLGRRRKRARIIQAPHCIISFRPR7CPPGALQAPEA 
PASRAEGPVAVWNGHTEGPAPARSAPKEPPGLPRPLGSFPCPT 
PQEDFPALGGPCPPRMPPSPGFSAWLLKGTPPPPPPGLVPPIS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PRELPGEEPSAHPVHQGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEFPEGL* *AAGPAAH 


5916


256 


633 


SPRMWEIWGPWHRWESFSLEGEMPSRIPEPSPDSTKGTSGKGCR 
TVTG AVHRHLNHVAG I IPWVLHSQLKPTAATAQDQWTSQQYPDH 
FTRLILQ*NQATADKNN+TTALLQPHQRL\VSPRWAEA 


5917 


1343 


827 


AHQII»TYLEP/ ICLWNYNKIIjTVFLTKSVLEI *KFIHTPQTYR j 






WO 01/53312 



PCT/US00/34263 



? * NDF FG I KE V YVS RRLR KTS F/ RLAVTFLEQAWS KECVP VDQ 
FMEHLLPSLLSLAS D PVPNVRVLLAKALRQMLLB KAY FRNAGNP 
HLE VIE ET I LALQSDRDQDVS FFAAIiE PKRRNI I DTAVLEKQN 


5918 


13 


1247 


EGAQ VARRR SRRQ WRAGRCGRGRGGR RAERTGGRG PPGRPR P L P 
PGPARRGRRRMETPFYGDEALSGLGGGASGSGGTFASPGRLFPG 
A?P TAAAGS MMKKDALTLSLS EQVAAALX PAPAPAS YppA\ADG 
APSAAPPDGLLASPDLGLLKLAS PELERIiI IQSNGLVTTTPTSS 
Q FLYPKVAAS EEQE FAEGF VKAItEDLKKQNQLGAGRAAAAAAAA 
AGGPSGTATGSAPFGEIAPAAAAPEAPVYA\NLSSY\AGGCRGL 
RGGAAT \ VAFAAEPVPFPPPPP PGALGPRRP /RLALQGRR PQTV 
PDVP \S FGES P\ PLS P I ET \DT P RR I \ KAKRKRL \RNPQ I RAP K 
PASRKLGAQSRALE R ES EDPS * S PEHGS LAS TAS UjREQVAQLJC 
QKVLSHVNSGCQLLPQHQVPAY


5919


1 


4254 


TSVQGDSQGTPTSSQGS INMEHWISQAIHGSTTSTTSSSSTQSG 
GSGAAHRLADVMAQTHI ENHSAPPDVTT YTSEHS I QVERPQGST 
GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQIjVNTLKRPKRPP 
LREFFVDDFEELIiEVQQPDPNQPKPEGAQMLAMRGEQLGWTWW 
PPSbEAALQRWGTISP KAPCLTTMDTNGKPLY IIiTYGKLWTRSM 
KVAYS I LHKLGTKQEPM VRPGDRVAL V F P NTJD P AAFMAAFYGCL 
LAEVVPVPIEVPLTRKDAGSQQIGFLLGSCGVTVALTSDACHKG 
LPKS PTGE I PQFKX3 WPKLLWFVTES KHI>S KP PRDWF \ PH I KDAN 
NDTAYTEYKTCK\DGSVLGVTVTRTALLTHCQAI»TQACGYTEAE 
TIVNVLDFKKDVGLWHG ILTSVMNMMHVIS IPYSLMKVNPIjSWI 
QKVCQYKAKVACVKSRDMHWAIiVAHRDQRDIHLSSLRMLIVADG 
ANPWSISSCDAFLNVFQS KGLRQEVI CPCASSPEAIiTVAIRRPT 
DDSNQPPGRGVLSMHGLTYGVIRVDSEEKLSVLTVQDVGLVMPG 
AIMCSVKPDGVPQIiCRTDEIGELCVCAVATGTSYYGLSGMTKNT 
FEVFAMTSSGAPISEYPFIRTGLLGFVGPGGLVFWGKMDGLMV 
VSGRRHWADDIVATALAVEPMKFVYRGRIAVFSVTVLHDERIVI 
VAEQRPDSTEEDS FQWMSRVLQAIDS IHQVGVYCLALVPANTLP 
KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 
PE IGPAS VMVGNLVSG KR I AQASGRDLGQ I E DNDQARKFLFLSE 
VLQWRAQTTPDHILYTLLNCRGAIANSLTCVQLHKRAEKIAVML 
MBRGHLQDGDHVALVYP PG I DL I AAFYGCLYAGCVP I T VRP PHP 
QNIATTLPTVKMIVEVSRSAC^TTQLICKLLRSREAAAAVDVR 
TWPLILDTDD*PKKRPAQICKPCNPDTLAYLDFSVSTTGMLAGV 
KMSHAATSAFCRSIKLQCELYPSREVAICLDPYCGLGFVLWCLC 
SVYSGHQSIL1PPSELETNPALWLLAVSQYKVRDTFCSYSVMEL 
CTKGLGSQTESLKARGLDLSRVRTCWVAEERPRIALTQSFSKL 
FKDLGLHPRAVSTSFGCRVNIJUICLC^TSGPDPTTVYVDMRALR 
HDRVRLVERGS PHSLPLME SGKIL PGVR 1 1 IANP ETKGPLGDSH 
LGEIV7VHSAHNASGYFTIYGDESLQSDHFNSRLSFGDTQTIWAR 
TG YLG FLRRTELTDANGERHDALYVVGALDEAMELRGMR YHP ID 
IETSVIRAHKSVTECAVFTWTNLLWW2LDGSEQEALDLVPLV 
rKWLEEHYLIVGVVVVVDrGVIPINSRGEKQRMHLRDGFLADQ 
LDPIYVAYKM 


5920 


1381 


1499 


QLGAVAHAGVSRI PP+ LFP PLHPTFLSLWCLHHKLP / HPPGASM 
VRPPWPRRPPAHISSVRQASTQVPRTVPHTQRVAKTIGTQTTGP 
SGVGCC7 P GRPLIiPCKCS£> AAHSTYRVQE PAVHI PGQE PLTASM 
IiAAAPLHEQKQMIGERIjYPLIHDVHTQLAGKITGMLLEIDNSEL 
llmlru s p es lthak i deavav.lq ahqame qp kaymh 


5921 


727 


157 


VCPGTGGE * glwgqlgglpketplkpmdaftgsglkrkfddvdv" 

GSSVSNSDDE IS S SDSAD SCDS IiNPPTTAS FTPT5 1 LKRQKQLR 
RKNVRFDQVTVYYFARRC3GFTSVPSQGGSSLGMAQRHNSVRSYJ 
LCEFAQEQEVNHREILREHLKEEKLHAKKMKLTKNGTVESVEAD 
GIjTLDDVSDEDIDVENVEVDDYFFIjQPLPTKRRRALLRASGVHR 

idaeekqelrairlsreecgcdcrlycdpeacacsqagikcqvd 
rmsfpcgcsrdgcgnmagriefnpirvrthylhtimkleleskr 

Q\GAAQQPQ\*GALPDCQLQPDRSTGL*DPSWIGSKGLSFTGKG 
AAATHLI ILRVTENRGAEGKRX 


5922 


2475 


495 


SYSNWGI»FP5VFIQVPRSRTGWLKPIFIiFYSYYE\ CMBTXiKG \T 
CLYNATQYKVCSPRNDRPDACYNPSEPAATTVFEIRTGLLLGDT
SKI ITRTEEKE IPKQX TLRFDACAAJNS KKLEIGCGSLK* ERS* 
RVENKYVCHESGVCKNCAYWPCVI * AT* KKNKNDSVYLQKGEAN 
PS CAAGHCNPLELI 1 TNPLDPHWKXGER VTLG INRTGLKPQWT 
LI KGEVHKCSPKPVPQTPYEEI^PAPELLKKTKNLFLQIiABNV 
IFLLNGTS CYVRGGTTIGDRWPWEA* ELVPTD PAPD 1 1 PI * KAE 
ASNF* VLKTSI IRQYCIAREGKDFI I PVGKPNCIGQKLYNSTTK 
TIT**DLNHTEKNPFSKFSKLKTA*AHAESH+DWTVPSGLY*IC 
RHRAYFRIiPNKWADSCVIGTIKPS FFLLP I KMGELLGFS VYASR 
E KKGI VIGNWKDNEW P RERI IQY YG P ATWAQDG S WGYR / TP / VY 
MLN WI I RLQ AILE 1 1 SNE TGRALT VLAWQE TQ^RNA I YQNRLAL 
DYLLVAEGGVCRKFNLTNCCLQINDQGQWKNIVRDMTKLAHVP 
IQVWHKFDPESLFGKWFPAIGGFKTLIVGVLLVrRTCLLLPCVL 
PLLFQMI KG I VATLVHQKTSAHVKYMNHYRS ISQRDSKS EDESE 
NSH 


5923 


137 


638


QLCGRRGQRFRTS I KRMHPI * RTCPNTOL/ 1 1 LLSQENTQI RDL 
QQENRELWISLEEHQDALELIMSKYRKQMLQIiMVAKKAVDAEPV 
LKAHQSHSAEIESQIDRICEMGEVMRKAVQVDDDQFCKIQEKIiA 
QLELENKELRELLS ISSESLQARKENSMDTASQAIK 


5924 


274 


2146 


EKGKVKDAGAEQWI SLSLS CKGS WETQFSNHLNSLTPPTS VRRM 
PLITTVTLLKM VARHHKKLLCSKAFS TQLQQKI FLHSQMG I HHQ 
SVCMKLKPNTSHIISILMGQPMALVQLETIAPLTIIIQKFQTQD 
HMKFW KNIiPLHSHHLTPSVPOTVI PKKTGSPRI VUC I TKT T CMS 
RELFESSLCGDLbNKVQASE\Q*NQSIESRKEKRKKSNKKDSSR 
S EERKSH KI PKLEPEEQNR PNERVDT VS EKPREEP VLBCEGS PS S 
ANTIFCSNNGSVHW\FKFQVGDLVWSKVGTYFWWPCMVSSDPQL 
EVHTKINTRGAFJSYHVQFFSNQPBRAWVHEXRVREYKGHKQYEE 
LLAEATKQASNHSEKQKIRKPRPQRERAQWDIGIAHAEKALKMT 
REER I EQ YTF I YI D KQPEE ALS QAKKS VASKTE VKKTRRPRS VL 
NTQPEQTNAGEVAS SLSSTE I RRHSQRRHTSAEEEEPPPVKI AW 
KTAAARKSLPAS ITMHKGSLDLQXCNMS P VVKI EQVFALQNATG 
DGKFI DQFVYSTKG I GNKTEI SVRGQDRLI ISTPNQRNEKPTQS 
VSSPEATSGSTGS VEKKQQRRS I RTRS ESEKSTE WPKKKI KKE 
QVGFLHVES 


5925 


216 


1911 


MMTAESPJEATGtiSPQAAQEKDGIVIVKVEEEDEEDHMWGQJDSTL 
ODTPPPDPE XPRORFRRPCYOMTFGPRHAljSRT.irPI.f'HrtMT.R P"R 
INTKEQ 1 LEDLVLEQFLSILPKEIiQVWLQE YRFDSGEEAVTLLE 
DI#E LDLSGQQVPGQVHGPEMLARGMVPLDP VQES S S FDLHHEAT 

qshfkhs srkprxo.qsralpaahi papphegsprdqamasalft 
ad sqamvk i sdmavsl i lee wgcqnlarrnls rdnrqenygs af 
pqggenrneneestskaetsbdsasrgettgrsqkefgekrdqe 
gktgerqqknpeektrkekrdsgpaigkdkktitgergprekgk 
glgrsfslssnfttpeevptgtkshrcdecgkcftrssslirhk 
1 ihtgekpyecsecgkaf\slns\nlvlhqri \htgekphecne 
cgkafshssnliluqrihsgekpyecnecgkafsqssdXltkhq 
rihtgekpyecsecgkafnrnsylilhrrvhtrekpykctkcgk 
\aftrsstiitlhhri hareraseys pas ldafgaflks c v 


5926 


2 


233 


DRCLMLKQGSQPGSPPAT/CEPPAPPVYQAPCQSCPEPPGAIIEP 
SDSPHHTPVHPPPEHSAACPAPATCCPP PRS SMS 


5927 


4146 


1248 


KHFS KFGS Q AL YQLKRP ASGQNS I SVMPAQKITKPAAKYGI PLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRV1JTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQI I SLMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAI FDQ 
MQQQRAEDNEAKWKRE I YGRGLPERQ KGQLAVERAKQVE EFLQR 
KREAMQNKARAEGHMG I LQMLAAM YGGRP S SS RGGKPRNKE EEV 
YLARLRQI RLQNFNERQQ I KAKLRGEKKE ANHS EGQEGSEEADM 
RRKK\ lE^LKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKGVKSSD VS F PLGQHETGGSPS KQQMRSVXSVTSAIjKBVGVDS 
SLTDTRETSEEMQKTNNAISSKREILRRliNENLXAQEDEKGKQN 
LSDTFEUffVHEDAKEHEKEKSVSSDRKKWEAGGQIiVIPLDELTL 
DTS FSTTERHTVGEVI KLGPNGS PRRAWGKS PTDSVLKI LGEAE 



LQLQTSLLENTTIRSEISPEGEKYKPLITGEKKVQCISHEINPS

AI VDS PVETKS P EFSEAS PQMSLKLEGKLEE PDDLETE IliQE PS 

GTNKDE\SLPCTITDVWISEEKETKETQSADRITIQENEVSEDG 

VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHIC/VHSE 

HLNLVPQVQSVQCSPEESFAFRSHSHLPPKMKNKNSLLIGLSTG 

LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDWLEIDEI 

EDENIKEQPSDSEDIVFEETOTDLQELQASMEQLLREQPGEEYS 

EEEESVLKNSDVEPTA1TGTDVADEDDNPSSESALNEEWHSDNSD 

GE I ASECECDSVFNHLEELRLKLEQEMGFEKFFEVYEKI KAIHE 

DEDENI E ICS KI VCNILGNEHQHLYAKI LHLVMADGAYQBDNDE 


5928 


4146 


1248 


KHFSKFGSQALYQLKRPASGQNS I S VMPAQK I TKPAAKYGI PLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQIIStjMKAEQMKRQEKERLERINRAREQG 
WRNVLS AGGSGE VKAP FLGSGGTI AP SS FS S RGQ YEH YHAI FDQ 
MQQQRAEDNEAKWKR E I YGRGL PERQKGQLAVERAKQVE E FLQR 
KREAMQNKARAE GHMG I LQNLAAMYGGRPS S SRGGKP RNKEEE V 
Y LARLRQ IR LQNFNERQQI KAKLRGE KKEANHS EGQEGS EEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKG VKS S DVS P P LGQHETGGS PS KQQMRSVI S VTS ALKEVG VDS 
SLTDTRETSEEMQKTNNAI SS KRE I LRRLNENLKAQEDEKGKQN 

DTS FS TTERHT VGE VIKLGPNGS PRRAWGKS PTDS VDKI LGEAE 
I*QLQTELI*ENTT I RSE 1 S PEG EKYKP LI TGE KKVQCIS HE I NPS 
AIVDS PVETKS PEFSEAS PQMSI^KLEGNIiEEPDDLETE ILQEPS 
GTNKDE\SLPCTITDVWISEEKETKETQSADRITIQENEVSEDG 
VSSTVDQLSDIHIE PGTNDSQHS KCDVDKS VQPE PFFHKWHS E 
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKNKNSLXiIGLSTG 
LPDAJINPKMLRTCSLPDLSKtjFRTLMDVPTVGDVRQDNLEIDEI 
EDENI KEGPSDS ED I VFEE TDTDI*QEIiQASMEQLLREQPGE E YS 
EEEESVLKNSDVE PTANGTDVADEDDNP S SE S ALNEE WHS DNSD 
GEIASECECDSVFNHLEELRLHLEQEMGFEKFFEVYEKIKAIHE 
DSDEXf IEICS KI VQWI LGNEHQHI.YAKI LHI*VMADGAYQEDNDE 


5929 


3 


1558 


IiDFSMTTQLPAYVAILLFYVSRASCQDTFTAAVYEHAAILPNAT 
LTPVS REEALAL MNRNLD I LEGAI TSAAD QGAH 1 1 VT PEDAI YG 
WNFKRDSLYPYLEDI PDPEVNW I PCNNRNRFGQTP VQSRLS CL\ 
AKNNS I Y WANI GDKKPCDTSD PQC P PDGR YQYNTD WF \DS QG 
KLVARYHKQNLFMGENQFNVPKEPE I VTFNTTFGSFGI FTCFD I 
ItFHDPAVTLVKD FHVDT I VFPTAWMNVltPHLS AVEFHSAWAMGM 
RVNFLASN IHYPSKKMTGSGI YAPNSSRAFHYDMKTEEGKLIiLS 
QLDSHPSHSAWNWTSYASSIEALSSGNKEFKGXVFFDEFTFVK 
LTGVAGNYTVC^IO)LCCHLSYKMSENIPNEVYALGAFDGLHTVE 
GR YYLQ I CTLLKCKTTNLNTCGDS AETAS TR FEMFS LS GTFGTQ 
YVFPEVLLSEMQLAPGEFQVSTDGRLFSLKPTSGPVLTVTLFGR 
LYEKDWASNASSGL7AQARIIMLIVIAPIVCSLSW 


5930 


113 


6082 


RGNCFWIVPFTMAQRTGLEDPERYLFVDRAVIYNPATQADATTAK - 
KLVWIPSERHGFEAAS I KEERGDEVMVELAENGKKAMVNKDDIQ 
KMNPPKFSKVEDMAELTCLNEAS VLHNLXDRYYSGL I YTYSGLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQS I LCTGESGAGKTENTKKVIQ YLAHVASSHKGRKDHN 
IPGE \LERQU,QANP I LESFGNARTVQNDNS SRFGKF IRINFDV 
TGY I VGANI ETYLLEKSRAVRQAKDERTFHI FYQLLSG \AGEHIi 
KSDI»LLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAMHIMG 
FSHE EILSMLKWS S VLQFGNI SFKKERNTDQASMPBNTVAQKL 
CHLLGMNVME FTRA ILTPRI KVGRDYVQ KAQTKEQAD FAVEALA 
KATYERL FR WLVHR I NTCALDRTKRQGASF I G I LDIAGFE I FELN 
S FEQLCINYTNE KLQQLFNHTMFI LEQEE YQREGIEWNFIDFGL 
DliQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 
GSHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMJCNMDPLND 
NVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 
KKGMFRTVGQLYKESIiTKLMATLRNTNPN FVRCI I PNHEKRAGK 1 
LDPHIiVIiDQLRCNGVIjEGI RI CRQGFPNR I VFQEFRQRYE I LTP | 
NAIPKGFMDGKQACBRMIRALELDPNLYRIGQSKI FFRAGVLAH '" 

LEKERDLKITDI I IFFQAVCRGYlARKAFAKKQQQLSAIiKVLQR 

NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 

EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM 

RARLAAKKQELEEIXjHDLBSRVEEEEERNQILQNEKKKMQAHIQ 

DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF 

1 KE KKLMEDR I AE CS SQIiAEEE E KAKNLAKI RNKQEVM I S DL EE 

RLKKEEKTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKLQL 

AKK E EELQGAIARGDDETLHKNNAL XWRELQAQ XAELQED FES 

EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 

QEVAEtiKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 

F KANLE KNKQGLETDNKEIiACEVKVLQQ VKAES E HKRKKLDAQV 

QELHAKVSEGDRLRVELAEKASKLQNELDNVSTLLEEAEKKGIK 

FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKMSLQ 

EQQEEEEEARKNLEKQVLALQSQIiADTKKKVDDDLGTIESLEEA 

KKKLL KDAE ALS QRLE EKALAYD KLEKTKNRLQQ ELDDLT VDLD 

HQRQ VASMLEKKQ \KKFDQLLAEEKS I SARYAEERDRAEAEARE 

KETKALS LARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG 

KNVHELEKSKRALEQQV\EEMRTQIjEEIjEDEI,QATEDAKIiRLEV 

nmqamkaqferdlqtrdeqneekkriil i kqvreleaelede rkq 
raiavaskkkmeidlkdi^aqieaankardevi kqlrklqaqmk 

D YQRE LEE ARAS RDE I FAQS KESEKKLKSLEAE I LQLQEE LAS S 
ERARRHAEQERDELADE ITNSASGKSALLDEKRRLEARIAQLEE 
EIjEEEQSNMEIiLNDRFRKTTLQVDTIiMAEIiAAERSAAQKSDNAR 
QQLERQNKELKAKLQELEGAVKBKFKATISALEAKIGQLEEQLE 
QEAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA 
NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5931 


113 


6082 


RGNCFW I VPFTMAQRTGLEDPERYLF VDRAVI YNPATQADWTAK 
KLVWI PSERHGFEAAS IKEERGDEVMVELAENGKKAMVNKDD IQ 
KMMPPKFS KVEDMAELTCLNEAS VLKNLKDRYYSGLI YTYSGLF 
CVVINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQS I LCTGESGAGKTENTKKV I Q YLAHVASSHKGRKDHN 
I PG E \XiERQLIiQANP I LES FGNARTVQNDNSSRFGKFIRINFDV 
TGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLLSGXAGEHL 
KSDLLLEGFNNYRFLSNGY IPI PGQ\QDKGNFRGDPGEAMH IMG 
FSHEEILSMLKWSS VLQFGNI S FKKERtJTDQASM PENTVAQKL 
CHLLGMNVMEFTRAI LTPR IKVGRD YVQ KAQTKEQAD FAVEALA 
KAT YERLFRWLVHR I NKALDRT KRQGAS FIGILD I AGFE IFELN 
SFEQLCINYTKTEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGIi 
DLQPCIDLIERPANPPGVLAHiDEECWFPKATDECrFVEKLVQEQ 
GSHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPLND 
NVATLLHQS SDRFVAELMKDVDR I VGLDQVTGMTETAFGSAYKT 
KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI I PKHEKRAGK 
LDPHLVLDQLRCNGVIjEGIRICRQGFPNRIVFQEFRQRyErLTP 
NAIPKGFMDGKQACERM I RAL ELD PNLYR I GQSKI FFRAGVLAH 
LE S E RDLKI TD 1 1 1 FFQAVCRG YLARKAEAKKQQQLS ALKVLQR 
NCAAYIiKLRHWQWWRVFTKVKPIOjQVTRQEEELQAKDEEIiLKVK 
EKQTKVEGELEEMBRKHQQLLEEKNILAEQLQAETELFAEAEEM 

DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF 
IKE KKLMEDR I AECS SQLAEEEE KAKNLAKI RNKQEVMI S DLEE 
RLKKEEKTRQELEKAKRKLDGETTDLQDQ IAELQAQ IDELKLQL 
AKKEEEIiQGAIARGDDETLHKNNALKVVRELQAQIAELQEDFES 
EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 
QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 
FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 
QELHAKVSEGDRLRVELAEKASKLQNEIiDNVS TLLEEABKXGIK 
FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ 
EQQEEEEEARKNLEKQWLALQSQIADTKKKVDDDLGTIESLEEA 
KKKiLKDAEALSQRLEEKAIAYDKLEKTKNRLQQELDDbTVDLD 
HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 

t/ri>VJ\T c T HTJ1VT CO 7k T PXVDQ CDC HXTO AT OXAMTT>r MC* C> VTM"kTr/-* 

ACi 1 I^JjoijAKiV]jJilSALiC.AlVaiSl* a 

KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV 
NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELEAELEDERKQ 
RALAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK 
DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS 

ELEEEQSNMELLNDRFRKTTLQVDTLNAEIAAERSAAQKSDNAR 
QQLERQNKELKAKLQELEGAVKS KFKAT I SALEAKI GQLEEQLE 
QEAKERAAANKLVRRTE KKLKE I FMQVEDERRHAEQYKSQMEKA 
NARM KQLKRQLEE AEEEATRANASRRKLQRE LDDATE ANEGLSR 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5932 


33 


572 


RHLEEICFLFLQKGRKLKLSGPRWEEGKPRGTGGLWVKAEANMG 
FG ATLAVGLT I ? VLS WT 1 1 ICFTCSCCCLYKTCRRPRPV\APP 
PHPP/PWHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQYPPPYPAQPMGPPAYHETIiAGGAAAPYPASQPPYNPAYMDA 
PKAAL 


5933 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRKESARESLCDSPMQNL5RPLLENKLK 
AFS IGKMSTAKRTLSKKEQEELKKKEDEKAAAEI YEEFLAAFEG 
SDGNKVKTFVRGGWNAAKEEKETDEKRGK1YKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NF YLGN I \WPQMNLKKCCCQEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERAbKNIiNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKMPNAPMLPPPKNKEDFEKTLSQAIVKVVIPrERNLlALI 
HRMIEFVVREGPMFEAM IMNRE INNPMFRFLFENQrPAHVYYRW 
KLYSIIiQGDSPTKWRTEDFRMFKNGSFWRPPPLNPYIjHGMSEEQ 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNDIGDAMVFC 
LNNAEAAEEIVDCITES LS I LKTPLPKKIARLYLVSDVLYNSSA 
KVANASYYRKFFETKLCQIFSDLNATYRTIQGHLQSENFKQRVM 
TCFRAWEDWAI YPEPFL IKLONIFt^LVNIIEEKETKD VPDDLD 
GAPIEEEIiDGAPLEDVDGI PIDATPIDDU3GVPI KSIiDDDLDGV 
P LDATEDS KKRE PI FKVAP S KWEAVDE S ELEAQAVTTSKWE LFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHELYSMPIKEEMTE 
SKFSKYS EMSEEKRAKLRE I ELKVMKFQDELESGKRPKKPGQS F 
QEQVBHYR DKLLQREKEKELERERERDKKDKE KLESRS KDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDS PRDVS KKAKRS PSGSRTPXRSRRSRSRS P 
KKSGKKS RSQSRS PHRSHKKS KGKTNTGR ECF FKKAVT YWKCDL F 
LCPERSVF 


5934 


1 


3190 


GTRKLKMADKT PGGS QKAS S KTRSS DVHS SGSSDAHMDASG PS D 
SDMPS RTRPKS PRKHNYRNE S ARES LCDS PHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFIiAAFEG 
SDGNKVKTFVRGGWNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLS RFE P PQS DSDGQRRS MDAPS RRNRSSG VL 
DDYAP GSHDVGDPS TT\ NF YLGN I \NP QMNLKKCCCQEFGRFGP 
LAS VKIMW P RTDEERARERNCGF VAFMNRRDAE RALKNLNGKM I 
MS FEMKLGWGKAVP I PPHP I YI PPSMMEHTLPPPPSGLP FNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAI VKWI PTERNLLAL I 
HRMIEFWREGPMFE AMIMNRE INNPMFRFLFENQT PAHVY YRW 
KLYSILQGDSPTKWRTEDFRMFXNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVEE PS KKGALKEEQRDKLEE I LRGLTPR KND I GDAMVFC 
LNNAEAAEE I VDCI TESLSI LKTPLPKKIARLYLVSDVLYNSSA 
KVANAS YYRKFFETKLCQ I FS DLNAT YRT I QGHLQSENFKQRVM 
TCFRAWE DWAI YPB PFL I KLQNI FLGL VNI I EEKETED VPDDLD 
GAP I EE E LDG APLED VDG I P I DATP I DDLD G VP I KS L DDDLDG V 
PLDATEDS KKNEP I FKVAPS KWEAVDES ELE AQAVTTS KWE LFD 
QHE ESE EEENQNQEE ESEDEEDTQS S KS EEHHLYSNP I KEEMTE 
SKFSKYSEMSEEKRAKLREIELKVMKFQDELESGKRPKKPGQSF 
QEQVEHYRDKLLQREKEKELBRSRERDKKDKEKLBSRSKDKKEK 
DE CTPTRKERKRRKSTS PS PS RS S SGRRVKS PS PKS ERS ERS ER 
SHKESSRSRSSHKDSPRDVS KKAKRSPSGSRTPKRSRRS RSRSP 
KKSGKKSRSQS RSPHRSHKKS KGKTNTGRKFFKKAVTYWKCDLP 
LCPERSVF 


5935 


3 


4493 


SYWLSGWRLSRPPRQFWAGWRGIGRFGTMAPVHGDDCEIGASAL 
SDSGSFVSSRARREKKSKKGRQEALERLKKAKAGERYKYEVEDF 
TGVYEEVDEEQ YS KLVQARQDDDWI VDDDGIG YVEDGRE I FDDD 
LEDDAIiDADE KGKDGKARNKD KRNVKKLAVTKPNN 1 KSMF I AC A 
GKKTADKAVDLSKDGLLGDILQDLNTETPQITPPPVM1LKKKRS 
IGASPNPFSVHTATAVPSGKIASPVSRKEPPIiTPVPLKRAEFAG 
DDVQVES TEEEQESGAMEFEDGDFDE PME VEEVDLE PMAAKAWD 
KESEPAEEVKQEADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 
VQEVQVDSSHLPLVKGADEEQVFHFYWLDAYEDQYNQPGWFLF 
GKVWIESAETHVSCCVMVKMIERTLYFLPREMKIDriNTGKETGT 
P I SMKD VYEEFDEKI ATKYKI MKFKS KPVEKNYAFE I PDV PE KS 
E YLE VKYS AEMPQLPQDIjKGETFSHVFGTNTS S LEI*F LMNRKI K 
GPCWLSVKKSTALNQPVSWCKVRAKIALKPDLVWIKDVSPPPLV 
VMAFSMKTMQNAKNHQNBIIAMAALVHHSFALDKAAPKPPFQSH 
FCWS KPKDC I FP YAFKEVI EKKNVKVEVAATERTLLG FFIAKV 
HKIDPDIIVGHNIYGFELEVLLQRINVCKAPHWSKIGRLKRSNM 
PKLGGRSGFGERMATCGRMICDVEISAKELIRCKSYHriSELVQQ 
ILKTERWIPMENIQNMYSESSQIiIjYI*IjEHTWKDA\KFII*QIMC 
ELNVLPLALQITNIAGNIMSRTLMGGRSERNBFLLLHAFYENNY 
I VPDKQ I FRKPQQKLGDEDEE IDGDTNKYKKGRKKGAYAGGLVL 
DPKVGFYDKFI LLLDFNSLYPSI IQEFNICFTTVQRVASEAQKV 
TEDGEQSQIPELPDPSLEMGILPREIRKLVERRKQVKQLMKQQD 
LNPDLILQYDIRQKALKLTANSMYGCLGFSYSRFYAKPLAALVT 

wf opt* ,vfrj/Ti vciun rt*y VMMT . B*t / TVnrYT*T^ OTMT MTM Q TNFT >T? RV PITT . 

GNKVKSEVNKLYKL1^IDIDGVFKSLI*uIjKKKKYAAIjVVEPTSD 
GNYVTKQELKGLDIVRRDWCDLAKDTGNFVIGQILSDQSRDTIV 
ENIQKRIiIEIGENVLNGSVPVSQFEINKALTKDPQDYPDKKSLP 
HVHVALWINSQGGRKVKAGDTVSYVICQDGSNLTASQRAYAPEQ 
LQKQDNliTIDTQYYIiAQQIHPWARICEPIDGIDAVLIATGWEL 
\DPTQFKVHHYHKDEENDALLGGPAQLTDEEKYRDCERFKCPCP 
TCGTEN I YDNVFDGSGTDME PSLYRCSNIDCKAS PLTFTVQLSN 
KLIMDIRRFXKKYYDGWLICEEPTCRNRTRHLPLQFSRTGPLCP 
ACMICATLQPEYSDKSLYTQLCFYRYIFDAECALEKLTTDHEKDK 
LKKQFFTPKVLQDYRKLKNTAEQFLSRSGYSEVNLS KLFAGCAV 
KS 


5936 


1124 


139 


RGEEQ FDAEFRR FACLGFGERIjQEFSRLLRAVHRS RAWTCYIAI 
RMLMATC CPSPTTTACTG P WQRAPPLRLLVQKRE ADS SGLAFAS 
RSLQRRKKGLLLRPVAPIiRTRPPLLISLPQDFRQVSSVIDVDLL 
PETHRRVRLHKHGSDRPLGFYI RDGMS VRVAPQG\ LERVPG I FI 
S RIiVRGGLAESTGLLAVSDEl LE VNG I EVAGKTLNQVTDMMVAN 
SHN\L1VTVKPANQRNNWRGASGRLTGPPSAGPGPAEPDSDDD 
SSDLVIENRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937 


31 


1600 


"PTSIiLKSTVQLMCRLLQDKRYQC\^SLAEIFKVliASFYVILVIL 
YGLTSS YSLWWMLRSSLKQYS FEALREKSNYSDI PDVKNDFAFI 
LHLADQYDPLYSKRFSIFLSEVSENKIiKQINLNNEWTVEKLKSK 
LVKNAQDKIELHLFMLNGLPDNVFELTEMEVLSLELIPEVKLPS 
AVSQLVNXjKELRVYHSSLWDHPAtiAFLEENLKILRLKFTEMGK 
IPRWVFHLICNLKELYLSGCVLPEQLSTMQLEGFQDLKNLRTLYL 
KSSLSRIPQWTDLliPSLQKLSLDNEGSKLVVIiNNLKKMVNL.KS 
XiEL I SCDLER I PHS I FSLP3NLHELDLRENNI»KTVE E 1 1 S FQHLQ 
NLSCU^WHNNIAYIPAQIGALSNLEQLSLDHNNIENLPLQLFL 
CTKLHYLDLS YNHLTFI PEE I QYL \SNLQ YFAVTNNNI EMLPDG 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LFQCKKLQCLUjGKNSLMWLSPHVGELSNLTHREPIG\WYIiETL 
P PEIjEGCQ SI) KRNCL I VE ENLLNTL PLPVTERLQTCLDKC 


5938 


395 


1865 


YKGEGFFCNQEARGERRKKKKAMS SPH I WSTGS SVYS TPVFSQK 
MT VW I LLLLS L YP<i FTS QKSDDDYED YASNKTWVLTP KVPEGD V 
TVILNNLLEGYDNKLRPDIGVKPTLIHTDMyVNSIGPVNAINME 
YTIDIFFAQTWYDRRIjKFNSTIKVLRLNSKMVGKIWIPDTFFRW 
S KKAD AHW I TT PNRW L R I WNDGR VL YS LiRLT I DAE CQ LQL»HN F P 
MDBHS CPhEFS S YG Y PR E E I V YQ WKRSS VE VGDTRS WR Ju YQ FS F 
VGLRNTTEVVKTTSGDYVVMSVYFDLSRRMGYFTIQTYIPCrLI 
WLSMVSFWINKDAVPARTSLGITTVIiTMTTLSTIARKSLPKVS 
YVTAMDLFVS VCFI FVFSALVE YG \TLII YFVSNRKPS KDKDKKK 
KNPAPTIDIRPRSATIQMNNATHIiQERDEEYGYECLDGKDCASF 
FCCFEDCRTGAWRHGRIHIRIAKMDSYARIFFPTAFCLFNLVYW 
VSYLYL 




66 


1404 


IRPGYLKEVQENSPGHRAGLEPFFDFIVSINGSRLNKDNDTLKD 
LLKANVEKPVKMLIYSSKTLELRETSVTPSDrLWGGQGLLGVSIR 
FCS FDGAKEKVWHVLE VESNS P AALAGLR PHSDY I IGADTVMNE 
SEDLFSLIETHEAKPLKLYVYNTDTDNCREVIITPNSAWGGEGS 
LG CG IG YG Y LIIR I PTRP FE EGKK I SLPGQMAGT P I TPIiKDGFTE 
VQLSSVNPPSLSPPGTTGIEQSLTGLSISSTP\PAVSSVIiSTGV 
PTVP\IiLPPQVNQSLTSVPPMESSYIiHLPGLMPFTRQG1jPNLPQ 
PSTFNLPR\PTHSWPGVGLYQEFVKPGVLPPLSSMPPRNLPG\I 
APLPLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPAT 
TTAKADAAS53LTVDVTPPTAKAPTTVEDRVGDSTPVSEKPVSAA 
VBANASESP 


5940 


145 


717 


RRSASRSASPRQSAGTAVTTGTRAGGTCLAAAHHRMRWRADGRS 
LEKLPVHMGLV1TSVEQEPSFSDIASLWWCMAVGISYISVYDH 
QGIFKRNNSRbMDEILKQQQEIjLGIiDCSKYSPEFANSNDKDDQV 
I^CHIAVKVLSPEDGKADIVRAAQDFCQLVAQKQKRPTDLDVDT 
LA\VYLVQMWLILI 


5941 


13 


6147 


MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSLLAVWLLALPVA 
WGQCNA?EW\LPFARPTNLTDEFEFPIGTYLNYECRPGYSGRPF 
S I ICLKNS VWTGAKDRCRRKS CRNPPDPVNGKVHVIKGIQFGSQ 
I KYSCTKGYRLIGSSSATCI ISGDTVI WDNETP ICDRIPCGLP P 
TITNGDFI STJNREWFH YGS WTYRCN PGSGGRKVFEL VGEPS I Y 
CTSNDDQVGIWSGPAPCCIIPNKCTPPNVENGILVSDNRSLFSL 
NEWEFRCQPGFVMKGPRRVKCQALNKWEPELPSCSRVCQPPPD 
VLHAERTQRD KDNFS PGQEVF YSCEPG YDLRGAASMRCTPQGDW 
S PAAPTCEVKSCDDFMGQLLNGRVLFPVNLQLGAKVDFVCDEGF 
QLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSPPVIPNGRHTG 
KPLEVFP FGKAVNYTCDPHPDRGTSFDL I GESTIRCTSDPQGNG 
VWSS PAPRCG1LGHCQAPDHFLFAKLKTQTNASDFP IGTSLKYE 
CRPE YYGRPFS I TCLDNLVWSS PKDVCKR^SCKTPPDPVNGMVH 
VITDIQVGSRINYSCTTGHRLIGHSSAECILSGNAAHWSTKPPX 
CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCWPGSGGRKV 
FEL VGEPS I YCTSNDDQVG I HSGPAPQ C I IPNKCTP PNVENG IL 
VSDNRSLFSLtJfiWEFRCQPGFVMKGPRRVKCQALITKWEPELPS 
CSRVCQPPPDVLHAERTQRDKDKFSPGQEVFYSCEPGYOLRGAA 
SI4RCTPQGDWSPAAPTCEVKSCDDFMGQIjIiNGRVLFPVNLQLGA 
KVDFVCDEG FQLKGSS AS YCVLAGMESLWNS S VPVCEQI FCPSP 
PVIPNGRHTGKPLEVFPFGKAVNYTCDPHPDRGTSFDHGESTI 
RCTSDPQGNG VWSS PAPRCG I LGHCQ AP DHFLFAKL KTQTNAS D 
FP IGTSLKYECRPE YYGRPFS I TCLDNLVWS S PKDVCKRKSCKT 
PPDPVNGMVHVITDIQVGSRINYSCTTGHRLIGHSSAECILSGN 
TAHWSTKPPIC^RIPCWLPPTIANGDFISTNTR3NFHYGSVVTYR 
CNLGSRGRKVFEIiVGEPSI YCTSNDDQVG I WSGPAPQCI I PNKC 
TP PNVENGI LVSDNRS LFSLNEWEFRCQPG FVMKG PRRVKCQ A 
LHKWEPBLPSCSRVCQPPPEI tiHGEHTPSHQDNFS PGQEVFYS C 
EPGYDLRGAASLHCTPQGDWSPEAPRCAVKS CDDFLGQLPHGRV 
LFPLNLQLGAKVSFVCDEGFRLKGSSVSHCVLVGMRSLWWNSVP 
VCEHIFCPNPPAILMGRHTGTPSGDI PYGKE r S YTCDPHPDRGM 
Predicted end
nucleotide
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TFNLIGESTI RCTSDPHGNGVWSS PAPRCELS VRAGHCKTPEQP 
P FAS PIT P I ND FE FP VGTS LNYE CR PG YFG KMFS I S CLENLVWS 
SVEDNCRRKSCGPPPEPFNGMVHINTDTQFGSTVNYSCNEGFRL 
IGSPSTTCLVSGNNVTWDKKAP ICE I ISCE P PPTISNGDFYSNN 
RTS FHNGTWTYQCHTGPDGEQLFELVGERS I YCTSKDDQVGVW 
SSPPPRCISTNKCTAPBVENAIRVPGNRSFFSLTEIIRFRCQPG 
FVMVGSHTVQCQTNGRWGPKLPHCSRVCQPPPEILHGEHTLSHQ 
DNFS PGQE VF YS CEP S YDLRG AASLHCTPQGDWS PEAPRCT VKS 
CDDFLGQLPHGRVLLPLNLQLGAKVSFVCDEGFRLKGRSASHCV 
LAGMKALWNSS VPVCEQI FCPNPPAI l*NGRHTGTPLGDI PYGKE 
VS YTCD PHPDRGMTFNL I GEST I RRTSEPHGNGVWS S PAPRCEL 
PVGAACPHPPKI QNGHYIGGHVSLYLPGMT IS YTCDPGYLLVGK 
GF I FCTDQGI WSQLDH YCKEVNCSFPLFMNG I S KELEMKKVYHY 
GDYVTLKCEDGYTLEGSPWSQCQADDRWDPPLAKCTSRTHDALI 
VGTLSGTIFFI LLI I FLSWI ILKHRKGNNAHENPKE VAIHLHSQ 
GGSSVHPRTLQTNEENSRVLP 


5942 


4509 


688 


YLY\T*MRANPIAYGISHKAYQIDPPL\RKHREQ\I,VIE\VGRKL " 
DK\AQM t RFEERTGYFS STDLGRTASHYYTKYNT I ETFNELFDA 
HKTEGDIFAIVSKAEEFDQIKVREEEXEELDTLLSNFCEIjSTPG 
GVENSYGKINILLQTYINRGEMDSFSLISDSAYVAQMAARIVRA 
LFE IAIiRKRWPTMTYRLliNliSKAIDKRLWGWASPLRQFSILPPH 
MLTRLEEKKLTVDKIiKDMRKDEIGHI IiHHVNIGLKVKQCVHQ I P 
S VMM EAF I QPITRT VIiRVTLS I YAD FTWNDQVHGT VGE P W W I WV 
EDPTNDHI YHSEYFLALKKQVI SKEAQLI*VFTI PX FEPLPSQ YY 
IRAVSDRWLGAEAVCIINFQHLILPERHPPHTELLDIiQPnPITA 
LGCKAYEALYNFSHFNPVQTQI FHTLYHTDCNVLLGAPTGSGKT 
VAAELAI FR VFN KY P TS KAVYI APLKAL VRERMDD W KVR I E E KEj 
G KKVIELTGD VTPDMKS I AKADL I VTTPEKWCGVS RS WQNRNYV 
QQVTILI IDEIHLLGEERGPVLEVI VS RTNFISSHTEKPVRIVG 
LSTALANARDLADWLNIKQMGLFNFRPSVRPVPLEVHIQGFPGQ 
H YCPRMASMNKPAFQAIRSHS PA KP VL I FVSSRRQTRLTALEL I 
AFLATEEDPKQWLNMDEREMENIIATVRDSNLKLTLAFGIGMHH 
AGLHERDRKTVEELFVNCKVQVLIATSTIAWGVNFPAHLVI I KG 
TE Y YDGKTRRYVDFP I TDVLQMMGRAGRPQ FDDQGKAVI L VHD I 
KKDFY KKFL YEPFP VES SIAjG VLSDHLNAE T AGGT I TS KODALD 
YITWTYFFRRLIMNPSYYNLGDVSHDSVNKFIjSHLIEKSLIELE 
LS YCIE IGEDNRS IEPLTYGRIASYYYLKHQTVKMFKDRLKPEC 
STEELLS I liSDAEEYTDLPVRHNEDHMNSE tiAKCLP I ESNPHSF 
DSPHTKAHLLLQAHLSRAMLP CP DYDTDTKT VLDQALRVCQAML 
DVAANQGWLVTVLNITNLIQMVIQGRWLKDSSLLTLPNIENHHL 
HLFKKWKPIMKGPHARGRTSIECLPELIHACGGKDHVFSSMVES 
ELHAAKTKQAWNFLSHLPEINVGrSVKGSWDDLVEGHNELSVST 
LTADKRDDNKWIKLHADQEYVLQVSLQRVHFGFHKGKPESCAVT 
PRFPKSKDEGWFLILGEVDKRELIALKRVGYIRNHHVASIjSFYT 
PEIPGRYI YTLYFMSDCYLGLDQQYD/NLSQRYTSES FCT5QHQ 
GL 


5943 


1 


2274 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQTWLPNHWFLRLR 
EGLKNQS PTE AEKPAS SSLPSS P PPQLLTRNWFGLGGELFLWD 
GEDSSFLWRLRGPSGGG\ EEPALSQYQRLLCINPPL FEIYQVL 
hS PTQHHVAL I GI KGLMVIjELP KRWGKNSE FEGGKSTVNCS TTP 
VAERFFTSSTSLTLKHAAWYPSEIL.DPHWLLTSDKVIRIYSLR 
EPQTPTNVI I IiSE AEEES LVLNKG RAYXAS LGETAVAFDFGPLA 
AVPKTLFGQNGKDEWAYPLY I LY ENG ETFLTY I S LLHS PGN/ 1 
WKAVGS IAHAS \AAEDNYGYDACAVLCLPCVPNI LVIATESGML 
YHCWLEGEEEDDHTSEKSWDSRIDLIPSLYVFECVELELALKL 
AS GEDDPFDS DFSCP VKLHRDPKC P SRYHCTHEAGVHS VGLTW I 
HKLHKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLPCRQPAP 
IRGFWIVPDILGPTMICITSTYECLIWPLLSTVHPASPPLLCTR 
EDVE VAES PIiRVLABTPDS FEKHIRS ILQRSVANPAFLXAS EKD 
lAPPPEECLQLLSRATQVFREQYIIiKQDLAKEEIQRRVKLLCDQ 
KKKQIi EDL S YCR E ER KS LREMAERLAD KYEEAKE KQED I MNRMK 
SEQ
ID 
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KLLHS FHSELP VLS DS ERDMXKELQL I PDQLRHLGNAI KQVTMK 
KDYQQQKMEKVLSLPKPT 1 1 LSAYQRKCIQSILKEEGEH I RBMV 
KQINDIRNHVNF 


5944 


167 


3428 


FS I ATFTDEPEVLTEPPS ATTTTTIG I SATWTTLAGSHGKRNNT 
ITTTSSKRKNRKNKITPENVQIIFDDPLPISYSQPEKVNGESKS 
SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAVVTTTVSS 
KKQ PS VL VTFPKEERXS VSGKAS I KLS ETISEGTSNS LS TCTKS 
GPS PLS S PNGKLT VAS PKRGQKRE EGWKEVVRRS KXVS VP STVI 
SR VIGRGGCNINA I RE FTGAHIDI DKQ KDKTGDR 1 1 T I RGGTE3 
TRQATQLINALIKDPDKEIDELIPKNRLKSSSANSKIGSSAPTT 
TAANTSLMG I KMTTVALS STS QTATALT VPAI S SAS THKT I KN P 
VN\NVR PGFP VS F P \ LAYP PPQ F AHALLAAQTFQQ I RP PRLPMT 
HFGGTF P PAQS TWGP FP VRPLS PARATNS PKPHMVPRHSNQWS S 
GS QVNS AGS LTS S PT TTTS S S ASTVPGT STNG S PS S P S VRRQLF 
VTWKTSNATTTTVTTTASNNNTAPTNATYPMPTAKEHYPVSS? 
SS PS PPAQPGGVSRNS PLDCGS AS PNKVAS S S EQE AGSP P WET 
TNTRP PNS S SS SGS S SAHSNQQQPFGS VS QEPRP P JbQQS Q VP P P 
E VRMT VP P LATS SA P VAVPSTAP VT YPMPQTPMGCP Q PT PKMET 
?AI RP P PHGTTAPHKNS AS VQNS S VAVLS VNH I KRPHSVPSS VQ 
LPSTLSTQSACQNSVHPANKPIAPNFSAPLPFGPFSTLFENSPT 
SAHAFWGGSWSSQSTPESMLSGKSSYLPNSDPLHQSDTSKAPG 
FR PPLQRPAPS PSG I VNMDS P YGS VT P S S THLGNFASNI SGGQM 
YGPGA PbGGAPAAANFNRQHFS PLSLLTPCSSASNDSSAQS VSS 
GVRAPS PAPSSVPLGS EK P SNVS QDRKVP VPIGTERSAR I RQTG 
TSAPSVIGSNLSTSVGHSGIWSFEGIGGNQDECVDWCNPGMGNPM 
IHRPMSDPGVFSQHQAMERDSTGIVTPSGTFHQHVPAGYMDFPK 

MVSSSTENNGPQTVWTGPMAPHMHSVHMNQLG 


5945 


1461 


197 


GVTHLFLFGKRKLRNG I AEDL KGQADFF FLL VS E A WATGSPRA 
WLTCLILPIiPGIIFSVLPKAMSRPLLITFTPATDPSDLWKDGQQ 
QPQPEKPESTLEX3AAARAFYEAIiIGDESSAPDSQRSQTEPARER 
KRKKRR IMKAPAAEAVAEGAS GRHGQGRSLEAEDKMTHRILRAA 
QEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQG 
AAVS YLLGRGAAWVGVCE LSGRDAAQLAEEAGF PE VARMVRE SH 
GETRSPBNRSPTPSLQYCENCDTHFQDSNHRTSTAKLLSLSQGP 
QPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTV 
LKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRE\TPPRVATLSW 
REERRREE \KDRAWERDLRTYMNLEF 


5946 


541 


1666 


ILGS YSS IQPEE YS \SWC\EWLQDLLA\ YVS PK\HS YLRDLP 
SEGS PQRVNS IDFV\ EL\EHLQPDVLVHAVLRWDF / TI LTEAV 
YS YRGQKQKKVMLTVEQAQDQHYALVLWGPGAAW \ YPQLQRKKG 
YI WE FKYLFVQCNYTLENLELHTTPWSSCE CLFDDDIRAITFKA 
KFQKSAPSFVKISDLATHLEDKCSGVVLIKAQISELAFPITASQ 
KIALNAHSSIiKSIFSSLPNIVYTGCAKCGIjEI»ETDENRIYKQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCLNRVI VPS S E I T YGMVVADLFHS LLAVS AE PC VLK I QSLFVL 
DENSYPLQQDFSLLDFYPDIVKHGANARL 


5947 


3 


1317 


RG I PDRRRRGP IGRVNMDLENKVKKMGLGHEQGFGAPCLKCKEK 
CEGFELHFWRKICRNCXNVAKKSM/TVLLSNEEDRKVGKLFSDT 
KYTTL I AKLKSDGI PM YKRNVM I LTNP VAAKKNVS INTVTYEWA 
P P VQNQALARQ YMQML FKE KQP VAGSEG AQYR KJCQLAKQ L P AHD 
QDPSKCHELSPREVKEMEQFVKKYKSEALGVGDVKLPCEMDAQG 
2KQMN I PGGDRS TPAAVGAMEDKSAEHKRTQ YS C YCCKLS MKEG 
D PAI YAERAG YDKLWHPAC F VCSTCHELLVDMI Y FWKNEKL YCG 
RHYCDSEKPRCAGCa3ELIFSNEYTQAENQiNWHLKHFCCFDCDSI 
LAGE I YVMVNDKPVCKPCYVKNHAWCQGCHNAI DPEVQRVTYN 
NFS WHASTE CFLCS CCSKCLIGQK FMP VEGMVF CS VE CKKRMS 


5948 


39 


3370 


YRE R YP VS GGS VLRS ALE VC WDFLSGLT EGS LLP EGFFS GP I DQ 
GNHYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM 
VEIEIEGRLHRIS I FDPLEI I LEDDLTAQEMSE CNSNKENSERP 
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 
SEQ
ID
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SPPSAPRRPPVYYKFIBKSAEEliDNSVEYDMDSEDYAWrjBIVWE 
KRKGOCVPAVSQSMFEFLMDRFEKESHCENQKQGBQQSLXDEDA 
VCCICMDGECQNSNVTLFCDMCNLAVHQECYGVPYIPEGQWIiC/ 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\VCALW\IP 
E\VGFANTVFIEPIDGVRNIPPARWKLT\CWLCKEKGR/VGACI 
QCHKANC YTAFHVTCAQKAGLYM KME PVKELTGGG TTFS VRKTA 
YCDVHTP PGCTRRPLN I YGDVEMKNG VCRKES S VKT VRS TS KVR 
KKAKKAKKALAEPCAVLPTVCAPYIPPQRLNRIANQVAIQRKKQ 
F VERAHS YWLLKRLS RNGAPLLRRLQSS LQSQRS SQQRENDEE M 
KAAKEKLKYWQRLRHDLERARLLIELIiRKREKIiKREQVKVEQVA 
MELRLTPLTVLIiR3 VLDQLQDKD PAR I FAQ P VSLKEVPD YLDH I 
KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV 
FYRAAVRLRDQGGWLRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 
SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLIiQPRKRSRSTCGDSEVEEESPGKRLDAGlt 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAPKCGRGKPALVRRHTLEDRSELISCIENGNYAKAARZAAEV 
GQSSMWXSTDAAASVLEPLKWWAKCSGYPSYPALIIDPKMPRV 
PGHHNGVTIPAPPIiDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 
WLPKSKMVPLGIDETIDKLKMMEGRNSSIRKAVRIAFDRAMNHL 
SRVHGEPTSDLSDID 


5949 


39 


3370 


YRERYPVSGGSVLRSAIjEVCWDFLSGLTEGSLLPEGFFSGPIDQ - 

GNHYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM 

VEI B IEGRLHR I S I FDPLE 1 1 LEDDLTAQEMS ECNSNKEWS ERP 

PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIV3Y 

S PPS APRR PP VY YK F IE KSAEELDNE VE YDMDEED YAWLE I VNE 

KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA 

VCTCICMDGECQNSNVILFCDMCNLAVHQSCYGVPYIPEGQWLC/ 

RAHCLQSRARPADCVLiCPNKGGAFKKTDDDRWGHV\VCALW\ I P 

E \ VG FANTVFIE P IDGVRN I PPARW KLT \ CNLCKEKGR/VGAC I 

QCHKANCYTAFHVTCAQKAGLYMKMEPVKEIjTGGGTTFSVRKTA 

YCDVHTPPGCTRRPLKflYGDVEMKNGVCRKESSVKTVRSTSKVR 

KKAKKAKKALAEPCAVLPTVCAPYIPPQRLNRIANQVAIQRKKQ 

FVERAHSYWLLKRLSRKGAPLLRRLQSSLQSQRSSQQRENDEEM 

KAAKEKLKYWQRIiRHDLERARLLIELLRKREKLKREQVKVEQVA 

MELRLTPLIVLJJRSVLDQLQDKDPAR I FAQPVSLKE VPD YLDHI 

KHPMDFATMRKRIiBAQGYKNLHEFEEDFDLriDWCMKYNARDTV 

FYRAAVRLRDQGGVVLRQARREVDSIGLEEASGMHLPERPAAAP 

RRPFSWEDVDRIiLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 

SRSKRAKLLKKEIALr,RNKLSQQHSQPLPTGPGLEGFEEDGAAL 

GPEAGEEVLPRLETLLQPRXRSRSTCGDSEVEEESPGKRLDAGL 

TNGFGGARSBQEPGGGLGRKATPRRRCASESSISSSKTSPLCDSS 

FNAPKCGRGKPAI.VRRHTLEDRSELI SCI ENGNYAKAAR I AAE V 

GQSSMWISTDAAASVLE PLKWWAKCSGYPS YPALI I DPKMPRV 

PGHHNGVTIPAPPLDVLKrGEHMQTKSDEKLFLVLFFDWKRSWQ 

WLPKSKMVPLGIDETIDKLKMMEGRNSSIRKAVRIAFDRAMNHL 

SRVHGEPTSDLSDID 


5950 


1166 


373


ESRSLTMSTSQPGACPCQGAASRPAILYALL3SSLKAVPRPRSR 
CLCRQHRPVQLCAPHRTCREALDVLAKTVAFLRNLPSFWQLPPQ 
DORRLLQGCWGPLFLLGLAQDAVTFEVAEAPVPSILKKILLEEP 
SSSGGSGQLPDRPQPSLAAVQWLQCCLESFWSLELSPKE\YACL 
KGPILFNPDVPGLQAASHIGHLQQEAHWVLCEVLEPWCPAAQGR 
LTRVLLTASTLKS I PTSLLGDLFFRP I IGDVD I AGLLGDMLLLR 


5951 


143 


5449 


WNVKPSLLWQLFKFSDKEEHEQNDS I SGKTGETGVEEMIATRK " 

VEQDSKETVKLSHEDDHILEDAGSSDISSDAACriJPNKTENSLV 

GLPSCVDEVTECNLELKDTMGIADKTENTLERNKIEPLGYCEDA 

ESNRQLESTEFNKSNIjEWDTSTFGPESNILEWAICDVPDQNSK 

QLNAIESTKIESHETANLQDDRNSQSSSVSYLESKSVKSKHTKP 

VIHSKQNMTTDAPKKIVAAKYEVIHSKTKVNVKSVKRNTDVPES 

QQNFHRPVTCVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKPCTLQ 
SEQ
ID
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DQTLVQ I PKPLTHSLS DKSHAHPGCIiKEPHHPAQTGHVSHS SQK 
QCHKPQQQAPAMKTNSHVKBELEHPGVEHFKEEDKLKIiKKPEKN 
LQPRQRRSSKS FSLDE PPLFI PDNI ATIRREGSDHSS S FESKYM 
WTP 3 KQ CG FCKKPHGNRFMVG CGRCDDW FHGDCVGtiS LS QAQQM 
GEEDKEYVCVKCCAEEDKKTEILDPDTIiENQATVEFHSGDKTME 
CEKLGL S KHTTNDRT KY I DDT VKHKVKI LKR E SGEGRNS SD CRD 
NE I KKWQLAPL RKMG Q PVLPRRS SEEKS E KI P KESTTVTCTGEK 
AS KPGTHE KQEMKKKECV \E KG VLNVHPAAS AS KPS ADQ IRQSVR 
HS LKDI LMKRLTDSNLKVPEE KAAKVATKIEKELFS FFRDTDAK 
yKNKYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIRMSPEEIiAS 
KELAAWRRRENRHTIEMIEKEQREVERRPITKITKKGEIEIESD 
APMKEQE AAME I QEPAANKS IiE KPEGS E K\ RXEEVDS MSKDTTS 
QHRQHLFDLNCKICIGRMAPPVDDLSPKKVKVWGVARKHSDKE 
AES I ADALS STSNILASEFFEEEKQES PKSTPSPAPRPEMPGTV 
EVESTFLARIiNFI WKGF INMPS VAKFVTKAYPVSGS PE YLTEDI* 
PDS IQVGGRIS PQTVWDYVEKI KASGT KE I CWRFTP VTEEDQ I 
SYTLLFAyFSSRKRYGVAANNMKQVKDMYLIPLGATDKIPHPLV 
P FDGPGLE LHRPNLIiLGL 1 IRQKLKRQHS ACASTSH I AETPE S A 
PP I ALPPDKKSKIEVS TEEAPEEENDFFNS FTTVLHKQRNKPQQ 
NLQEDIiPTAVEPLMEVTKQEPPKPI/RFX/PGVLIGWENOPTTLEL 
AKKPLP VDDILQSLLGTTGQVYDQ\AQSVMEQNTVKE I PFLNEQ 
TNS KI EKTDNVEVTDGENKE IKVKVDNISESTDKSAE IETS WG 

CC0TC&/3OT TCT CT .DfZr CD Prt / G rnf » n*T I'M F OTAC<V/>otiimrpnvn 
OoolohbaL 1 al^oussxriLrt'LJVla X LAf Jj±JNLto J.y&j£yKbT VESKE 

KTLKRQLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGNVSCSEN 
LVANTARS PQF INLKRD PRQ AAGRS QPVTTSESKDGDSCRiTGEK 
HMLPGLSHNKEHLTEQ INVEEKLCSAEKNS CVQQSDNLKVAQNS 
PSVENI QTSQAEQAKP LQED I LMQNI ETVHPFRRGSAVATSHFE 
VGNTCPSEF PS KS ITFTSRSTS PRTSTNFSPMRPQQPWLQHLKS 
SPPGFPFPGPPNFPPQSMFGFPPHLPPPLLPPPGFG\FA\QNPM 
VPWPPW\HLP\GQPQRMMGPLSQASRYIGPQNFYQVKDIRRPE 
RRHSDPWGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKRERHEKE 
WEQES ERHRRRDRS QDKDRDRKSREEGHKDKERARLSHGDRGTD 
GKASRDSRNVDKKPDKPKSEDYEKDKEREKSKHREGEKDRDRYH 
KDRDHTDRTKSKR 


5952 


3226 


639 


PPARRSARDLPRALSMEAARPSGSWNGALCRLL\LVTL\AFLIF 
ASDACKNVTLHVPSKLDAEKLVGRVNLKECFTAANLIHSSDPDF 
QI LEDGS VYTTNTI LLSS EKRS FTI LLSNTENQE KKK I F VFLEH 
QTKVLKKRHTKEKVLRRAKRRWAPIPCSMLENSLGPFPLFLQQV 
QSDTAQNYTI YYS I RGPGVDQEPRNL F YVERDTGNLYCTRPVDR 
EQYES FEI IAFATTPDG YTPELPLPL I IKIEDENDN YP I FTEET 
YT FT I FENCRVGTTVGQVCATDKDE PDTMHTRLKYS I IGQVPPS 
PTLFSMHPTTGV1 TTTSSQLDRELIDKYQLKI KVQDMDGQ YFGb 
QTTSTCI INIDDVNDHLPTFTRTSYVTSVEENTVDVEILRVTVE 
DKDLVNTANWRANYT I LKGNENGNFK I VTDAKTNEGVL CWKPL 
NYEEKQQMILOIGVVNEAPFSREASPRSAMSTATVTVNVEDQDE 
GPECN P P I QTVRM KEN AEVGTTSNG YKAYDP E TRS SS G IRYKKI* 
TDPTGW VT I DENTGS I KVFRSLDRE AET I KNG I YN IT VIASDQG 
GRTCTGTLGI ILQDVNDNS PFI PKKTVI ICKPTMSSAEIVAVDP 
DEPIHGPPFDFS LESSTSE VQRMWRLKAINDTAARLS YQNDPPF 
GS Y WP I TVRDRLGMS S VTS VTLCDC I TEND CTHR VDPRIGG 
GGVQLGKWAILAILLGIALFFCILFTIiVCGASGTSKQPKVIPDD 
LAQQNLIVSNTEAPGDDKVYSANGFTTQTVGASAQGVCGTVGSG 
IKNGGQETI EMVKGGHQTSESCRGAGHHHTLDS CRGGHTEVDNC 
RYTYSEWHS FTQPRLGEES IRGHTL IKKT 


5953 


330 


811 


PLLCNPDPGWYWWVKQESE I S KESQEMDARPKIiDIiGFKEGQTIK 
LCIGNITNKKGGASKPRTARGGGLSI,LPPPPGGKVTIPPPSS/V 
KLPSTNHVTPPSIPKSNHGGSDADILLDLDS PAPVTTPAPTPVS 
VSNDLWGDFSTASSSVPNQAPQPSNWVQF 


5954 


32 


2130 


PPPPPPKLANMADLEAVXiADVSYLMAMEKSKATPAARASKRlVl* 
PEPSIRSVMQKYLAERNEITFDXIFNQKIGFbLFKDFCLNEINE 
AVPQVKF YEEIKE YEKLDNEEDRLCESRQ I YDAYI MKELLS CSH 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PFS KQAVEKVQSHLSKKQVTSTLFQPYI EE I CESLRGD I FQKFM 
ESDKFTRFCQWKNVELN IHLTMNE FS VHRI IGRGGFGEVYGCRK 
ADTGKMYAMKCLNKKRI KMKQGETLALNERI MLSLVS TGDCPFI 
VCMTYAFHTPDKLCFILDIiMNGGDLHYHLSQHGVFSEKEMRFYA 
TE 1 ILGLEHMHNRFVVYRDLKPANILLDEHGHARIS \DLGLACD 
FSKKKPHASVGTHGYMAPEVIjQKGTAYDSSADWFSIK5CMLFKLI> 
RGHSPFRQHKTKDKHEIDRMrLTVWVELPDTFSPELKSIJjEGLL 
CRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPP 
RGE VNAADAFDIGSFDEEDTKG I KLLDCDQELYKKTFPLVI SERW 
GQBVTETVYEAVNADTDKI EARKRAKNKQLGHEEDYALGKDCIM 
HGYMLKLGKPFLTQWQRRYFYLFPNRIiEWRGEGESRQNIiIjTMEQ 

ILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNE
TFKEAQRLLRRAPKFLNKPRSGTVELPKPSLCHRNSNGL


5955 


1726 


444 


KREREFRIAVCPLRYPSAYESSPGTEL.RECGLCRSGQEFADCRR ' 
PANRQDVLSGW INLPVLQLTKDPLKTPGRLDHGTRTAF IHHREQ 
VWKRCINIWRDVGLFGVLNEIANSEEEVFEWVKTASGWAliALCR 
WAS SLHGS L FPHLSLRSEDLI AEFAQVTNWS S CCLRVFAWHPHT 
NKFAVALLDDS VRVYNAS STI VPS LKHRLQRNVASLAWKPLSAS 
VLAVACQSC ILI WTLDPTS LSTRPS SGCAQVLSHPGHTP VTS LA 
WAPSGGRLLSASPVDAAIRVWDVSTETCVPLPWFRGGGVTNLLW 
SPDGS KILATTPSAVFRVWEAQMWTCERWPTLSGRCQTGCWS PD 
GSRLLFTVLGEPLIYSLSFPERCGEGKG\ALEVQSQQRLWQICL 
RQQYRHQMVRRGLGERLTPWSGTPVGNVWLCL 


5956 


1705 


139 


GVGVRGARAMATVQEKAAALNDSALHS PAHRPPGFSVAQKP FGA 
TYVWS S I INTLQTQVEVKKRRHRLKRHNDCFVGSEAVDVI FSHL 
1 QNKYFGDVDI PRAKWRVCQALMD Y KVFEAVPT KVFGKDKKPT 
FEDS 5 CS LYR FTT I PNQDS QLGKENKL YS PAR YADALFKS SD I R 
SASLEDLWENLSLKPANSPHVNISATLSPQVINEVWQEETIGRL 
LQLVDLPLLDSLLKOQEAVPKIPQPKRQSTMVNSSNYLDRGIL.K 
AYSDSQEDEMLSAA I DCSB YLPDQMWEI SRSFPEQPDRTDLVK 
ELLFDAI GRYYS S REPLLNHLS DVHNG I AELLVNGKTE XALEAT 
QLLLKLLDFQNREE FRRLL YFMAVAANPS EFKLQKES DNRM WK 
RIFSKAI VDNKNLSXGKTDLLVLFL\MDHQKDVFKI PGTL \HKI 
VS\VK\LMAIQNTGRDPNRDAGYIYCQRIDQRDYS1WTEKTTKDE 
LLNLLKTLDEDS KLSAKEKKK\LLGQFYKCHFDI FI EHFGD 


5957 


1479 


451


ELQVAVAMDTLDRVVKPKTKRAKRFLEKREPKLNENI KNAMLI K 
GGNANATVTKVLKDVYALKKPYGVLYKKKNITRPFEDQTSLEFF 
SKKSDCSLFMFGSHNKKRPNNIiVIGRNYDYHVLDMIELGIENFV 
SLKD 1 KWSKCPEGTKPML I FAGDDFDVTEDYRRLKS LLIDFFRG 
PTVSWIRIiAGI^WLHFTALNGKIYFRSYKLLLKKSGCRTPRIE 
LEEMGPSLDLVLRRTKI^DDLYICLSMKMPKALKPKKKKNISHP 
TFGTTYGR IHMQKQDLS KLQTRKM\ KGLKKRPAER I T3DHE KKS 
KRIKKKLMELSQPLLFHCVIjLKRI IKHQS I QSFL 


5958 


1 


3138 


AAALGMLLWFPACQAFNLDVEKLT VYSGPXGSYFG YAVDFHI PD 
ARTAS VLVGAPKANTSQ PD I VEGGAVY YCP WPAEGS AQCRQ I P F 
DTTNNRKIRVNGTKEPIEFKSNQWFG\ATVKA\HKGKSCGPVAP 
LLFTWKNFLKPTPEKGPVGTCYVAIQNFSAYAEFSPCGNSKADP 
EGQGYCQAGFSLDFYKNGDLI VGG PGSFYWQGQVITASVAD 1 1 A 
NYS FKD I LRKLAGE KQTEVAPAS YDDS YLG YS VAAGEFTGDSQQ 
ELVAG I PRGAQNFG YVS I INS YDMTFIQNFTGEQMAS YFG YTW 
VS DVNSDGLDDVLVGAPL FMERE FESNP RE VGQI YL YLQVSS LL 
FRDPQ ILTGTETFGRFGSAMAHLGDLNQDGYNDIAIGVP FAGKD 
QRGKVLIYNGHKDGLNTKPFPKFCQGVWASHAVPSGFGFTLRGD 
SD IDKND YPDL IVGAFG TGKVAVYRARP WTVDAQLLLKPM I IN 
LENKTCQVPDSMTSAACFSLRVCASVTGQSIANTIVLMAEVQLD 
SLKQKGAIKRTLFLDNHQAHRVFPLVIKRQKSHQCQDFIVYLRD 
ETEFRDKLSPINISLNYSLDESTFKEGLEVKPILNYYRENIVSE 
QAHILVDCGEDNLCVPDLKLSARPDKHOVI IGDENHLMLI INAR 
NEGEGAYEAELFVMI PEEADYVGIERNNKGFRPLSCEYKMENVT 
RMWCDLGNP MVSGTNYS LGLRFAVPRLEKTNMSINFDLQIRS S 
NKDNPDSNFVSLQINITAVAQVEIRGVSHPPQIVLPIHNWEPEE 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








EPHiCEEEVGPLVSHIYELHNIGPSTISDTILEVGWPFSARDEFli 
LYIFHIQTLGPLQCQPNPNINPQDIKPAASP3DTPELSAFLRNS 
TIPHLVRKRDVHWEFHRQSPAKILNCTNIECLQISCAVGRLEG 
GESAVLKVRSRLWAHTPLQRKWDPYALASLVSFEVKKMPYTDQP 
AKLPEGS 1AI KTSVI WATPNVS FS IPLWVI ILAILLGLLVLAIL 
TlALWKCGFFDRARPPQEDMTDREQIiTNDKTPEA 


5959


1 


1166


GTSGYAAQQIiPSIiLKEREFHLGTLNKVFASQWIiNHRQWCGTKC 
NTLFWDVQTSQITKIPILKDREPGGVTQQGCGIHAIELNPSRT 
LLATGGDNPNSLAIYRLPTLDPVCVGDDGHKDWIFSIAWISDTM 
AVSGSRDGSMGLWEVTDDVLTKSDARHNVSRVPVYAHITHKALK 
DIPKEDTNPDNCKVRALAFNNKWKEIiGAVSLDGYFHLWKAENTL 
SKLLSTKLPYCRENVCLAYGSBWSVYAVGSQAHVSFLDPRQPSY 
NVKSVCSRERGSGIRSVSFYEHIITVGTGQGSLLFYDIRAQRFL 
EERLSACYGSKPRLAGENIiKLTTGSKGWLWHDETWRNYFSDIDF 
FPNAVYTHCYDSSGTKLFVAGGPLPSGLHGNYAGLWS 


5960 


2853 


870 


F VWS DGG PRPRRGPAVGAGAAHL S DP WAMT PGT ANRATN P LNKE 
LDWASINGFCEQLNEDFEGPPLATRLLAHKIQSPQEWEAIQALT 
VLETCMKSCGKRFHDEVGKFRFLNELIKWSPKYLGSRTSEKVK 
NKIIjELLYSWTVGLPEEVKIAEAYQML.KKQG\rVECSDPKLPDDT 
TFPLPPPRPKNVIFEDEEKSKMLARLIjKSSHPEDLRAANKI*IKE 
KVQEDQKRMEKISKRVNAIEEVNNNVKLLTEMVMSHSQGGAAAG 
SSEDL\MKEL\YQRCERMRPTLFPTGRVDTEDND\EAI*AEXLC2A 
NDWLTQVINLYKQLVRGEEVNGDATAGSIPGSTSALLDIiSGLDL 
PPAGTTYPAMPTRPGEQASPEQPSASVSIiLDDELMSLGLSDPTP 
P5GPSLDGTGWNSFQSSDATEPPAPALAQAPSMESRPPAQTSLP 
ASSGLDDLDLLGKTLLQQSLPPESQQVRWEKQQPTPRIiTLRDLQ 
taKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LES I KPSNILPVTVYDQHGFRILFHFARDPLPGRSDVLWWSM 
LSTAP Q P I RN I VFQSAVP KVMKVKIiQ P PSGTELPAFNP I VHP S A 
ITQVbliLANPQKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


SGEPRPEPGNMATCIGEKIEDFKVGNLLGKGSFAGVYRAESIHT 
GLEVAIKMIDKKAMYKAGMVQRVQNEVKIHCCLKHPSILELYNY 
FEDSNYVYLVLEMCHNGEMNRYLKNRVKPFSENEARHFMHQI I T 
GMLYLHSHGlLHRDLTLSNLLLTRNMNIKIADFGIiATQLKMPHE 
KHYTLCGTPNYISPEIATRSAHGLESDVWSLGCMFYTLLIGRPP 
FDTDTVKNTLNKVVLADYEMPTFLSIEAKDLIHQLLRRNPADRL 
SLS S VLDHPFMSRNS STKS KDLGT VEDS ID SGHAT ISTAI TAS S 
STSISGSLFDKRRLLIGQPTjPNKMTVFPKNKSSTDFSSSGDGNS 
FYTQWGNQETSNSGRGRVIQDABERPHSRYLRRAYSSDRSGTSN 
SQSQAKTYTMERCHSAEI^LSVSKRSGGGENEERYSPTDNWANIF 
NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFPFADPTPQTE 
TVQQWFGNLQINAHLRKTTEYDSISPNRDFQGHPDLQKDTSKNA 
WXDTECVKKNSDASDNAHS VKQQNTMKYMTALHSKPE I IQQECVF 
GSDPLSEQSKTRGM3PPWG YQNRTLRS I TSPLVAHRLKP IRQKT 
XKAWS I LDSEEVCVELVKE YASQE YVKE VLQI SSDGNT ITIYY 
PNGG\RGFPLA\DRPPSPT\DNISR\YSF\DNLPEKYWRKYQYA 
S RFVQL VRSKS P K I T YFTR YAKCILMENS PGADFEVWFYEGVK I 
HKTEDFIQVIEKTGKSYTLKSESEVNSIiKEEIKMYMDHANEGHR 
ICLALES I I SEEERKTRSAPFFPI I IGRKPGSTSSPKALSPPPS 
VDSNYPTRDRASFNRMVMHSAASPTQAPILNPSMVTN3GLGLTT 
TASGTDISSNSLKDCLPKSAQLl.KSVFVKNVGWATQ\liTSGAVW 
VQFNDGSQLWQAGVSS I S YTS PNGQ\ TTR\ YGENEKLPDY I KQ 
KLQCLSS ILLMFSNPTPNFH 


5962 


20 


2447 


RVCSs SASTASQAVMADAWEE IkRtAADFQRAQFAEATQRLSER 
NC IE I VNXLIAQKQLEVVHTLDGKE YI T PAQ ISKEMRDELHVRG 
GRVNIVDLQQVINVDLIHIENRIGDIIKSEKHVQLVLGQLIDEN 
YIiDRIiAEEVNDKLQESGQVTISELCKTYDLPGNFLTQALTQRLG 
RI ISGHIDLDNRGVI FTEAFVARHKARIRGLFSAITRPTAVNSL 
ISKYGFQEQLLYSVIiEELVNSGRLRGTVVGGRQDKAVFVPDIYS 
RTQSTWVDSFFRQNGYLEFDAIiSRLGI PDAVSYI KKRYKTTQLI* 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








FLKAACVGQGLVDQVEA3VEEAISSGTWVDIAPLLPTSLSVEDA 
AI LLQ QVMRAFS KQAS T WFSDT WVSEKF\ INDCTEL F RELMH 
QKAEKEMKNNPVHLITESDLKQISTLBSVSTSKKDKKDERRRKA 
TBGSGSMRGGGGGNAREYKIKECVKKKGRKDDDSDDESQSSHTGK 
KKPEIS FMFQDE I EDPLRKHIQDAPEEFI SEIAEYLI KPLNK7TY 
LEWRSVFMSSTTSASGTGRKRTIKDLQEEVSNLYNNIRLFEKG 
MKFFADDTQAALTKHIiLKSVCTDITNLIFNFIASDLMMAVDDPA 
AITSE IRKKI LSKLSEETKVALTKLHNSLNE ECSI EDFI S CLDSA 
AEACDlMVKRGDKICRERQrLFQHRQAriAEQLICVTEDPALIIiHLT 
SVLLFQFSTHSMLHAPGRCVPQI IAFLNSKI P EDQHALL VKYQG 
LVVKQLVSQSKKTGQGDY PLiKNEIiDKaQED VASTTRKEltQE uSS 
S I KDLVLKSRKSSVTEE 


5963 


62 


1130 


PWNPQDFPGNRGLMG\QKGE IG PP \GQQGKKGAPGMP\GIjMGSN 
GSPGQPGTPGSKGSKGEPGIQGMPGASGLKGEPGATGSPGEPGY 
MGLPGIQGKKGDKGNQGEKGIQGQKGENGRQGIPGQQGIQGHHG 
AKGERGEKGEPGVRGAIGS KGESGVDGLMG PAGP KGQ PGDPGPQ 
GP PGLDGKPGREFSEQFIRQ VCTDVI RAQLPVIjIiQSGRIRNCDH 
CLSQHGS PGI PG P PGP IGPEGPRGLPGLPGRDGVPGLVGVPGRP 
GVRGLKGLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGISKEG 
PPGDPGLPGKDGDHGKPG IQGQPGPPG IODPSLCFS VIARRDP F 
RKGPNY 


5964


3 


2147 


SCRTRGRLS PLQ P RE AG S SRGS RARS E P PRPGGME EACQ VQTT K 
RGDPHELRNIFIjQYASTEVDGERYMTPEDFVQRYLGI*YWDPNSN 
PKIVQLLAGVADQTKDGLISYQEFLAFESVLCAPDSMFIVAFQL 
FDKSGNGEVTFENVKE I FGQT I IHHHIPFNWDCEFIRLHFGHNR 
KKHLNYTEFTQFLQELQLEHARQAFALKDKSKSGMISGLDFSDI 
MVT I RSHMLTP FVE ENLVS AAGGS I S HQ VSFS YFUAFNS LLNNM 
ELVRKI YSTLAGTRKDAEVTKE EFAQS AI RYGQATP3U2 IDI I*YQ 
IiADLYKASGRIiTLADIERIAPLAEGAIiPYNIaAELQRQQSPGLGR 
P I WLQI AES AYRFTLG S VAGAVGATAVYP IDLVKTRMQNQRGSG 
SWGELMYKNSFDCrKKVLRYEGFFGLYRGLIPQLIGVAPEKAI 
KLTVKDFVRDKFTRRDGSVPLPAEVLAGGCAGGSQVI FTNPLE I 
VKIRLQ VAGEITTG PR VS ALNVLRDLGI FGLYKGAKACFLRD I P 
FSAIYF P VYAHCKLLIiADENGHVGGLNLLAAGAMAG \ VPAASLV 
TPADV I KTRLQVAARAGQTTYSGVIDCFRKI L\RE EGPSAFWKG 
TAARVFRSSPQFG \ VTLVTYELLQRGF YIDFGGL KPAGSEPTPK 
SRIADLPPANPDH IGGYRLATATFAG I ENKFGLYLPKFKSPS VA 
WQPKAAVAATQ 


5965 


1 


1498 


MVTWLYRFLPTSNMAAKIiRSLLPPDLRIiQFWIiHARLQKCFLSRG 
CGS YCAGAKAS PLPGKMAMGLMCGRREI*LRL LQSGRRVHS VAGP 
SQWLGKPLTrRIiLFPAAPCCCRPHYLFLAASGPRSLS TSAIS FA 
EVQVQAP P WAATPS P TAVPEVASGETAD WQTAAE QS FAELGL 
GSYTPVGLIQNIjLBFMHVDLGLPWWGAIAACTVFARCIiIFPLIV 
TGQREAAR IIINHLPE I QKFSSR I REAKLAGDHIEY YKASS EMAL 
YQXKHGI KLYKPLI LPVTQAPI FISFF IALREMANLPVPSLQTG 
GLW WFQDLTVS DP I YI L PLiAVTATMWAVLELGAETGVQSS DLQW 
MRWVIRMMPHTLPITMHFPTAVFMYWLSSNLFSLVQVSCLRIP 
AVRTVLKI PQRWHDLDKLPPREGFLES FKKGWKNAEMTRQLRE 
REQRMRNQLEIAARGPLRQTPTHNPLLQPGKDNPPNI PSS \SSS 
SSKPKSKYPWHDTLG 




in') 




YMSRVHGMHPKETTRQLSIiAVKDGIiIVETLTVGCKGS KAGIEQE 
GYWLPGDE ID WETENHDW YCFECHLPGE Vhl CDLCFRVYHS KCL 
SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE 
RAIDLNKKGKDNKHPMYRRItVHSAVDVPTIQEKVNEGKYRSYEE 
FKADAQLLLHNTVIFYGADSEQADIARMLYIODTCHEL\DELQLC 
KNC F VLANAR PDNWFC YPCI PNHELDWAKMKGFGFWP AKVMQKE 
DNQVDVRF FGHHHQRAWI PSENIQDI TVNIHRLHVKRSMGWKKA 
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKKLS AS S PRKLHRSTQTTNDGVCQSMCHDKYTKIFNDF 






WO 01/53312 



PCT/US00/34263 



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KDRMKSDHKRETERVVRE ALE KLRS EME E E KRQAVN KAVANWQG 
EMDRKCKQVKE KCKE EFVEE I KJCLATQHKQLI S QTX KKQWCYNC 
EEEAMYHCCWNTSYCS IKCQQEHWHABHKRTCRRKR 


5967 


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIEllRNQKQIANIDRITK 
YMSRVHGMHP KETTRQLSLAVKDGL I VETLTVGCKGS KAGIEQE 
G YWLPGD E IDWETENHDW YC FE CHLPG E VL I CDLC FRVYHS KCL 
S DEFRLRDSSS PWQCPVCRS I KKKNTNKQEMGTYLRFI VSRMKE 
RAIDI1NKKGKDNKHPMYRRI1VHSAVDVPTIQEKVTJEGKYRSYEE 
FKADAQLLLHNTVIFYGADSEQADrARiVlLYKDTCHEX*\DELQLC 
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA 
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT 
QEPRAKKGRRNQ S VE P KKE EPE PETEAVS SSQE I PTM PQP I EKV 
SVSTOTKKLSASSPRMLHRSTQTTNOGVCQSMCHDKYTKIFNDF 
KDRMKSDHKRETERVVREALEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKEKCKEEF VEEI KKLATQHKQl, ISQTKKKQWCYWC 
EEEAMYHCCWNTSYCS IKCQQEHWHAEHKRTCRRKR 


5968 


81 


1288 


VRFPRRGGAPPTVLTPGRQQGVFLGPQRPGSEPDIPARGQPHPP 
RPVGVSTSAQAQVQPPAMHRRRLALGLGFCLLACTSI*SVtiWVYIj
ENWLPVSYVPYYLPCPEIFNMKLHYKREKPI.QPWWSQYPQPKL
LEHRPTQLLTLTPWLAPIVSEGTFNPELLQH1YQPLNLTIGVTV
FAVGN/HFLESAEEFFKRGYRVHYYIFTDNPAAVPGVPLGPHRI*
LSSIPIQGHSHWEETSMRRMETISQHIAKRAKREVDYXjFCLDVD
MVFRNPWGPETLGDLVAAIHPSYYAVPRQQFPYERRRVSTAFVA
DSEGDFYYGGAVFGGQVARVYEFTRGCHMAILADKANGIMAAWR
EESHLNRHFISNKPSKVLSPEYLWDDRKPQPPSLKLIRFSTLDK
DISCLRS 


5969 


1126 


503 


DVGFNIKRKRCDLDVFLESPRKPSGRRDRAPEKQRRIAANKCLC
TGVREGEPPS/TTSQKVKEAGRDFTYLIWIjFGISITGGIjFYTI
FKBLFSSSSPSKIYGRALEKCRSHPEVIGVFGESVKGYGEVTRR
GRRQHVRFTEYVKDGLKHTCVKFYIEGSEPGKQGTVYAQVKENP
GSGEYDFRYIFVEIESYPRRTIIIEDNRSQDD


5970 


316 


4712 


SQDNIGHRLLQKHGWKLGQGLGKSLQGRTDPIPIWKYDVMGMG
RMEMKLDYAEDATERRRVLEVEKEDTEELRQKYKDYVDKEKAIA
KALEDLRANFYCELCDKQYQKHQEFDNHINSYDHAHKQRLKDLK
QREFAPJTVSSRSRKDEKKQEKALRRIiHEIiAEQRKQAECAPGSGP
MFKPTTVAVDEEGGEDDKDESATNSGTGATASCGLGSEFSTDKG
GPFTAVQITNTTGLAQAPGLASQGISPGIKNNLGTPLQKLGVSF
SFAKKAPVKLESIASVFKDHAEEGTSEDGTKPDEKSSDQGLQKV
GDSDGSSNLDGKKEDEDPQDGGSLASTLSKLKRMKREEGAGATE
PEYYHYIPPAHCKVKPWFPFLLFMRASEQMDGDNTTHPKKAPES
KKGSSPKPKSCIKAAASQGAEKTVSEVSEQPKETSMTEPSEFGS
KAEAKKALGGDVSDQSLESHSQKVSETQMCESNSSKETSLATPA
GKESQEGPKHPTGPFFPVLSKDESTALQWPSELLIFTKAEPSIS
YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHIjQGLDPGE
PKKSKEVGGEKIVRSSGGRMDAPASGSACSGLNKQEPGGSHGSE
TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKADTEEK
SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA
PPRRRRRAQDDSQRRSLPAEEGSSGKKDEGGGGSSSQDHGGRKH
KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASSHRIIHQ
KSPSQYSEEEEEEDSGSEHSRSRSRSGRRH5SHRSSRRSYSSSS
DASSDQSCYSRQRSYSDDSYSDYSDRSRRHSKRSHDSDDSDYAS
SKHRSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS
SSCSRSRSKRRSRSTTAHSWQRSRSYSRDRSRSTRSPSQRSGSR
KRSWGHESPEERHSGRRDFIRSKIYRSQSPHYFRSGRGEGPGKK
DDGRGDDSKATGPPSQNSNIGTGRGSEGDCSPEDKNSVTAKLLL
EKIQSRKVERKPSVSEEVQATPNKAGPKLKDFPQGYFGPKLPPS
LGNKPVLPLIGKLPATRKPNKKCEESGL3RGEEQEQSETEEGPP 
GSSDALFGHQFP\SEETTGPLLDPPPEESKSGEVTADHPVAPLG 
PPAHFDCYLGDPT1SHNYLPDPSDGNTLESLDSSSQPGPVESSL 
LPIAPDLEHFPSYAPPSGDPSIESTDGAEDA\SIAPIiESQPITF 
TPEEHEKYSKLQQAAQQHIQQQLLAKQVKAFPASAALAPATPALj
QPIHIQQPATASATSITTVQHAIJjQHHAAAAAAAIGIHPHPHPQ
PLAQVHHIPQPHLTPISLSHLTHSIIPGHPATFLASHPIHIIPA
SAIHPGPPTFHPVPHAALYPTLLAPRPAAAAATALHLHPLLHP r 
FSGQDLQHPPSHGT 


5971


53 


2149 


SFLYFVGVDMDNPIGNWDGRFDGVQLCSFACVESTILLHINDII
PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN
RS3LFYTLNGSSVOSQPQSKSKNTWYIDEVAEDPAKSLTEISTD
FDR3SPPLQPPPVNSLTTENRFHSLPFSLTKMPNTNGSIGHSPL
SLSAQSVMEELNTAPVQESPPLAMPPGNSHGLEVGSLAEVKENP
PFYGVIRWIGQPPGLNEYLAGLELEDBCAG\CTDGTF/REGTRY
FTCALKKALFVKLKSCRPDSRFASLQPVSNQIERCNSLAIWEAY
LSEWEENTPTQKWEKEGLEIMIG\KKKGIQGHYNSCYLDSTLF
CLFAFSSVLDTVLLRPKEKNDVEYYSETQELLRTEIVNPLRIYG
YVCATKIMKLRKILEKVEAASGFTSEEKDPEEFLNILFHHILRV
EPLLKIRSAGQECVQDCYFYQIFMEKNEKVGVPTIQQLLEWSFIN
SNLKFAEAPSCLillQMPRFGKDFKLFKKIFPSLELNITDLLEDT
PRQCRICGGLAMYECRECYDDPDISAGKIKQFCKTCNTQVHLHP
KRLNHKYNPVSLPKDLPDWDWRHGCIPCQNMELFAVLCIETSHY
VAFVKYGKDDSAWLFFDSMADRDGGQNGFNIPQVTPCPEVGEYIj 
KMSLEDLHSLDSRRIQGCARRLLCDAIYVPCTQSPTMSLYK 


5972 


440 




EPFREELAYDRMPTLERGRQDPASYAPDAKPSDLQLSKRLPPCF
SHKTWVFSVLMGSCLLVTSGFSLYLGNVFPAEMDYLRCAAGSCI
PSAIVSFTVSRRNANVIPNFQILFVSTFAVTTTCLIWFGCKLVL
NPSAININFNLILLLLLELLMAATVIIAARSSEEDCKKKKGSMS
DSANILDEVPFPARVLKSYSWEVIAGISAVLGGIIALNVDDSV
SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVEVLIAISSL
TSPLLFTASGYLSFSIMRIVEMFKDY?PAIKPSYDVLLLBLLLV
LLLQA/GPQHGHRHPVRALQGQCKAAGCILGHPERPAGAPGWGG
GQBPPEGVRQGESLESRRGANGPVTPRRGNRVAAPSLAPGMETH 
NP 


5973 


65 


2007


NGDGKDLFGHIWAWRSNGIISKFFRRSPHAGMAHDEPDAKSPKTG
GRAPPGGAEAGEPTTLLQRLRGTISKAVQNKVEGILQDVQKFSD
NDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLEE
HTDTCLPKQSVYDAYRKYCESLAGCRPLSTANFGKIIREIFPDI
KARRLGGRGQSKYCYSGIHRKrLVSMPPLPGLDLKGSESPEMGP
EVTPAPRDELVEAACALTCDWAERILKRSFSSIVEVARFLLQQH
LISARSAHAHVLKAMGLAEEDEHAPRERSSKPKNGLENPEGGAH
KKPERLAQPPKDLEARTGAGPLARGERKKSWESSAPGANNLQV
NALVARLPLLLPRAPR5LIPPIPVSPPILAPRLSSGALKVATLP
LSSRAGAPPAAVPIINMILPTVPALPGPGPGPGRAPPGGLTQPR
GTENREVGIGGDQGPHDKGVKRTAEVPVSEASGQAPPAKAAKQD
lEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRL
PWETWGSGGEGNSAGGAERPGPMGEAEKGAVLAQG\QGDGTVSK
GGRGPGSQHTKEAEDKIPLVPSKVSVIKGSRSQKEAFPLAKGEV
DTAPQGNKDLKEHVLQSSLSQEHKDPKATPP 


5974 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR
TV\ASIKWDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRFLNKKETQMKDLDVITIPSKDWMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQVVGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM
EEQWEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE 
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5975 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR
TV\ASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRPLNKKETQMKDLDVITIPSKDWMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQVVGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHKPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM
EEQWEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5976 


20 


2949 


VHHLHLTRVSVWNLDIILRIAQQMGIKTLNLVLG\LKRA\LEF
PEVSWMEVKDPNMKGAMLTNTGKYAIPTIDA\EAYAIGKKEKPP
FLPEEPSSSSEEDDPIPDBLLCLICKDIMTDAWIPCCGNSYCD
ECIRTALLESDEHTCPTCHQNDVSPDALIA&KFLRQAVNNFKNE
TGYTKRLRKQLPSPPPPIPPPRPLIQRNLQPLMRSPISRC3QDPL
MIPVTSSSTHPAPSISSLTSNQSSLAPPVSGNPSSAPAPVPDIT
ATVSISVHSEKSDGPFRDSDNKILPAAALASEHSKGTSSIAITA
LMEEKGYQVPVLGTPSLLGQSLLHGQIiIPTTGPVRINTARPGGG
RPGWEHSNKLGYLVSPPQQIRRGERSCYRSINRGRHHSERSQRT
CGPSLPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 
GQP\PPAGYSV?PPGFPPAPANLSTPWVSSGVQTAHSNTIPTTQ 
APPLSREEFYREQRRLKEEEKKKSKLDEFTNDFAKELMEYKKIQ 
KERRRSFSRSKSPYSGSSYSRSSYTYSKSRSGSTRSRSYSRSFS 
RSHSRSYSRSPPYPRRGRGKSRNYRSRSRSHGYKRSRSRSPPYR
RYHSRSRSPQAFRGQSPNKRNVPQGETEREYFNRYREVPPPYDM
KAYYGRSVDFRDPFEKERYREWERKYREWYEKYYKGYAAGAQPR
PSANRENFSPERFLPLNIRNSPFTRGRREDYVGGQSHRSRNIGS
NYPEKLSAHDGHKFQKDNTKSKEKESEWAPGDGKGNKHKKHRKRR
KGEBSEGFLNPELLETSRKSREPTGVEENKTDSIjFVLPSRDDAT
PVRDEPMDAESITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV
SKKENIVKPAKGPQEKVDG\DVRDLLDIiWL\QLKKPKEETPKDL
TILNHHIiPLRRMKKSIi\EPP\EKI*TLNQQK\TPRNKTSQRGKSE
EGLFQRCQIRKANN 


5977 


1363 


1336 


FLEDRGQVLSHFQCLSLHSINHILHPGAGVAAGPATGW/REYLT
PVLKESKFKETGVITPEEFVAAGDHLVHHCPTWQWATGEELKVK
AYLPTGKQFLVTKNVPCYKRCKQMEYSDELEAIIEEDDGDGGWV
DTYHNTGITGITEAVKEITLENKDNIRLQDCSALCEEEEDEDEG
EAADMEEYEESGLLErTDEATLDTRKIVEACKAKTDAGGEDAIIjQ 
TRTYDLYITYDKYYQTPRLWLFGYDEQRQPLTVEHMYEDISQDH 
VKKTVTIENHPHLPPPPMCSVHPCRHAEVMKKIIETVAEGGGEL 
GVHMYLLIFLKFVQAVIPTIEYDYTRHFTM


5978 


160 


3213 


RDGARRWGGCQSPLTWAPGFYRRFDLATSGRRLRGQTAEPAGRQ
RPRREPEAMDEQSVESIAEVFRCFICMEKLRDARLCPHCSKLCC
FSCIRRWLTEQRAQCPHCRAPLQLREIiVNCRWAEEVTQQIiDTLQ
LCSLTKHEENEKDKCENHHEKLSVFCWTCKECCICHQCAIiWGGMK
GGHTFKPLAEIYEQHVTKVNEEVAKLRRRLMELISLVQEVERNV
EAVRNAKDERVREIRNAVEMMIARLDTQLKNKLITLMGQKTSLT
QETELLESLLQEVEHQLRSCSKSELISKSSEILMMFQQVHRKPM
ASFVTTPVPPDFTSELVPSYDSATFVLENFSTLRQRADPVYSPP
LQVSGLCWRLKVYPDGNGWRGYYLSVFLELSAGLPETSKYEYR
VEMVHQSCNDPTKNIIREFASDFEVGECWGYNRFFRLDLLANEG
YLNPQNDTVILRFQVRSPTFFQKSRDQHWYITQLEAAQTSYIQQ
INNLKERLTIELSRTQKSRDLSPPONHLSPQNDDALETRAKKSA
CSDMLLER\GPYSAS\VREAKEDEEDEEKIQNEDYHKEIjSDGDL
DLDLVYEDEVNQLDGSSSSASSTATSNTEENDIDEETMSGENDV
EYNNMELEEGELMEDAAAAGPAGSSHGYVGSSSRISRRTKLCSA
ATSSLLDIDPLILIHLLDLKDRSSIENLWGLQPRPPASLLQPTA
SYSRKDKDQRKQQAJ^RVPSDLKMIiKRLKTQMAEVRCMKTDVICN
TLSEIECS55AASGOMQTSLFSADQAAIAACGTENSGRLQDLGME
LLAKSSVAMCYIRNSTNKKSNSPKPARSSVAGSIiSLRRAVDPGE
NSRSKGDCQTLSEGSPGSSQSGSRHSSPRALIHGSIGDILPKTE
DRQCKALDSDAVWAVFSGLPAVEKRRKMVTLGANAKGGHLEGIi
QMTDLENNSETGELQPVLPEGASAAPEEGMSSDSDIECDTENEE
QEEHTSVGGFHDSFMVMTQPPDEDTHSSPPDGEQIGPEDLSFNT
DBNSGR 


5979 


212 


3665 


1>P DM'l'M YJbW h KLLAFGF AFLDTE VFVTGQS PTPS PTDAYLNASE 
TTTLSPSGSAVISTTTIATTPSKPTCDEKYANITVDYLYNKETK
LFTAICLNVTTENVEGGNNrCTNNEVHNLTECKNASVSISHNSCTA
PDXTLILDVPPGVEKVPVHCCS\QVEQPDSTIWLKWKNIETSTC
DTQNITYRFQCGNMIFDNKEIKIiENLEPEHEYKCDSEILYWSHK
FTNASKIIKTDFGSPGEPQIIFCRSEAAHQGVITWNPPQRSFHN
FTLCYIKETEKDCLNLDKNIjIKYDfcQNLKPYTKYVLSLHAYIIA
KVQRNGSAAMCHFTTKSAPPSQVWNMTVSMTSDNSMHVKCRPPR 
DRNGPHERYHLEVEAGNTLVRNESHKWCDFRVKDLQYSTDYTFK 
AYFHNGDYPGEPFILHHSTSYNSKALIAFLAFLIIVTSIALLW
LYKIYDLHKKRSChn^DEQQELVERPDEKQLMNVEPIHADILLET
YKRKIADEGRLFLAEFQSIPRVFSKFPIKEARKPFNQNKNRYVD
ILPYDYNRVELSEINGDAGSNYINASYIDGFKEPRKYIAAQGPR 
DETVDOFWRMIWEQKATVIVMVTRCEEGNRNKCAEYWPSMEEGT 
RAFGECCCKDLTKHKRCP\DYIIQKLNIVNKKEKATGREVTHIQ 
FTSWPDHGVPEDPHLirjLKLRRRVNAFSNFFSGPIWHCSAGVGR 
TGTYIGIDAMLEGLEAENKVDVYGYWKLRRQRCLMVQVEAQYI
JjIHQAIiVEYNQFGETEVNLSEIjHPYLHNMKKRDPPSEPSPIiEAE 
FQRLPSYRSWRTQHIGNQE\ENKSKNRNSNVIPYDYNRVPLKHE 
LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVMI
AAQGPLKETIGDFWQMIFQRKVKVIVMLTELKHGDQEICAQYWG 
EGKQTYGDIEVDLKDXDKSSTYTLRVFELRHSKRKDSRTVYQYQ 
YTNWSVEOLPAEPKELISMIQVVKQKIjPQKN'SSEGNKHHKSTPL 
LIHCRDGSQQTGIFCALLNLLESAETEEWDIFQWKALRKARP
GMVSTFEQYQFLYDVIASTYPAQNGQVKKNNHQEDKIEFDNEVD 
KVKQDANCVKPLGAPEKLPEAKEQAEGSEPrSGTEGPEHSVNGP 
ASPALNQGS 


5980


3 


2363 


DAWGCKLRRLRFTYGTQTRVSLALPGQYELVHTLVAHQGNWETI
PEEDIjBVQENNEDAAHDLTELEVTMHHAIiLQEVDVWAPCQGIjR
PTVDVLGDLVNDFLPVITYALHKDELSERDEQELQEIRKYFSFP
VFFFKVPKLGSEIIDSSTRRMESERSPLYRQLIDLGYLSSSHWN
CGAPGQDTKAQSMLVEQSEKLRHLSTFSHQ7LQTRLVDAAKALN 
LVHCHCLDIFIKQAFDMQRDLQITPKRLEYTRECKENELYESLMN 
IANRKQEEMKDMIVETLNTMKEELLDDATNMEFKDVIVPENGEP 
VGTREIKCCIRQIQELIISRLNQAVANKLISSVDYLRESFVGTL
ERCLQSLEKSQDVSVHITSNYLKQILNAAYHVEVTFHSGSSVTR
MLWEQIKQIIQRITWVSPPAITLEWKRKVAQEAIESLSASKLAK
SICSQFRTRLNSSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK
DHAPRLARLSLESRSLQDVLLHRKPKIiGQELGRGQYGVVYLCDN
WGGHFPCALKSVVPPDEKHWNDLALEFKYMRSLPKHERLVDLKG
SVIDYNYGGGSSIAVLLIMBRLHRDLYTGLKAGLTLETRLQIAL 
DVVEGIRFLHSQGLVHRDIKLKNVLLDKQNRAKITDLGFCKPEA 
MMSGSIVGTPIHMAPELFTGKYDNSVDVYAFGIIjFWYICSGSVK
LPEAFERCASKDHLWNNVRRGARPERLPVFDEECWQLMEACWDG
DPLKRPLLGIVQPMLQGIMNRLCKS\NSEQPWRGLDDST 


5981 


1 


2519


GRkHSAAMERPWGAADGLSRWPHGLGLLLLLQLLPPSTLSQDRL 
DAPPPPAAPLPRWSGPIGVSWGLRAAAA\GGAFPRGGRWRRSAP
GVEDEECGRVRDFVAKjLANNTHQHVFDDLRGSVSLSWVGDSTGV
ILVLTXFHVPLVIMTFGQSKLYRSEDYGKNFKDITDLINNTFIR 
TEPGMAIGPEKSGKWLTAEVSGGSRGGRIFRSSDFAKNFVQTD 
LPPHPLTQMMYSPQNSDYLLAIiSXENGLWSKNFGGKWEEIHKA
VCLAKWGSDNTIPFTTYANGSCECADLGALBLWRTSDLGKSFKTI 
GVKXYSFGLGGRFLFASVMADKDTTRRIHVSTDQGDTWSMAQLP 
SVGQEQFYSILAANDDMVFNHVDSPGDTGFGTIFTSDDRGIVYS 
KSLDRHLYTTTGGETDFTNVTSLRGVYITSVLSEDNSIQTMITF
DQGGRWTHLRKPENSECDATAKNKNECSLHIHASYSISQKLNVP
MAPLSEPNAVGIVIAHGSVGDAISVMVPDVYISDDGGYSWTKML
EGPHYYTILDSGGIIVAIEHSSRPINVIKFSTDEGQCWQTYTFT
RDPIYFTGLASEPGARSMNISIWGFTESFLTSQWVSYTIDFKDT
LERNCEEKDYTIWIiAHSTDPEDYEDGCILGYKEQFIiRLRKSSVC 
QNGRDYWTKQPSICLCSLEDFLCDFGYYRPSNDSKCVEQPELK 
GHDLEFCLYGREEHLTTNGYRKIPGDKCQGGVNPVREVKDLKKK 
CTSNFLSPEKQNSKSNSVPIILAIVGLMLVTWAGVLIVKKYVC
GGRFLVHI*YSVLQQH\AEA\NGVDGVDALDTASHTNKSGYHDD3
DEDIjLE 


5982 


56 


2316 


ATRPPRGSSWCRQFSRTASAAPGRSNMLRIPVRKALVGLSKSPK
GCVRTTATAASNLIEVFVDGQSVMVEPGTTVIiQACEKVGMQIPR
FCYHERLSVAGNCRMCLVEIEKAPKWAACAMPVMKGWNTLTNS
EKSKKAREGVMEFIiliANHPLDCPICDQGGECDLQDQSWMFGNDR
SRFLEGKRAVEDKNTIGPLVKTIMTRCIQCTRCIRFASEIAGVDD
LGTTGRGNDMQVGTYIEKMFMSELSGNIIDICPVGALTSKPYAF
TARPWETRKTESIDVMDAVGSNIWSTRTGEVMRILPRMHEDIN
EEWISDKTRFAYDGLKRQRLTEPMVRNEKGLLTYTSWEDALSRV
AGMLQSFQGKDVAAIAGGLVDAEALVALKDIjLNRVDSDTLCTEE
VFPTAGAGTDLRSNYBLNTTIAGVEEADWLIiVGTNPRFEAPtiF
NARIRKSWLHNDLKVALIGSPVDLTYTYDHLGDSPKILQDIASG
SHPFSQVLKEAXKPMWLGSSALQRND3AAILAAVSSIAQKIRM
TSGVTGDWKVTWILHRIASQVAALDLGYTCPGVEAIRKNPPKVLF
LM5ADGGCITRQDLPKDCFIIYQGHHGDVGAPIADVILPGAAYT
EKSATYVNTEGRAQQTKVAVrPPGLAREDWKIIRALSEIAGMTL
PYDTL\DQVRNRLEEVSPNLVRYDDIEG\ANYFQQANELSKLVW
QQLLADPLVPPQLTMKDFYMTDSISRASQTMAKCVKAVTEGAQA
VEEPSIC 


5983 


248 


1763 


EARGDGGRRRHRASGRRAGRGEP\AGLKSQGQRAVPKRAVARGG
RQXYSAAIALLEPAGSEIADDLSILYSNRAACYLKEGNCSGCIQ
DC^JRAL3LHPFSMKPLLRRAMAYETLEQYGKAYVX)YKTVLQIDC
GLQLANDSVNRLSRILMELDGPNWREKLSLIPAVPASVPLQAWH
PAKEMISKQAGDSSSHRQQGITDEKTFKALKEEGNQCVNDKNYK
DALSKYSECLKINNXECAIYTNRALCYIiKIiCQFEEAKQDCDQAL
QLADGNVKAFYRRAIAHKGLKNYQKSLIDLNKVILLDPSIIEAK
MELEEVTRLLNLKDKTAPFNKEKERRKIEIQEVNEGKEEPGRPA
GEVSTGCLASEKGGKSSRSPEDPEKLPIAKPNNAYEFGQIIKAL
STRKDKEACAHLLAITAPKDLPMFLSNKLEGDTFLiIiLIQSLKNN 
LIEKDPSLVYQHLLYLSKAERFKMMLTLISKGQKELIEQLFEDL 
SDTPNNHFTLEDIQALKRQYEL 


5984 


755


1193 


SSVCMACTYVSNLGKKQRSVSFLASGLMRVSTGPELRLHHSFVL
TGDVGRRICRLLVGIiFTKGDTSSKRVHPFSPGPCFIibCDLARVG
SSPKINVSPFYQN\QTSTQRSCTVFVWQRCSLVGPFQVTVFTMY 
FHHSLRSISRFSSG 


5985 


22 


1408 


RRPNPSIPSAAAGMSHIQIPPGLTELLQGYTVEVLRQQPPDLVE
FAVEYFTRLREARAPASVLPAATPRQSIjGHPPPEPGPDRVADAK
GDSESEEDEDLEVPVPSRFNRRVSVCAETYNPDEEEEDTDPRVI 
HPKTDEORCRLQEACKDIIiI»FKMLDQEQLSQVC»DAMFERIVKAD 

EHVIDGGDDGDNFYVIERGTYDILVTKDNQTRSVGQYDNRGSFG
ELALMYNTPRAATIVATSEGSLWGLDRVTFRRIIVKNKAKKRKM
FESFIESVPLLKSLEVSERMKIVDVIGEKIYKR/DGERIITQGE
K\ADSFYIISSGEVSILIRSRTKSNKDGGNQEVEIARCHKGQYF
GEIALVTNKPRAASAYAVGDVKCUVMDVQAFERIiIiGPCMDIMKR
NISHYEEQLVKMFGSSVDLGNLGQ 


5986 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGIAHPKNHLSPQQGGATPQVP
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED
GLRWTPKSPLDPDSGLLSCTLPNGPGGQSGPEGERSLAPPDASI
LISNVCSIGDHVAQELPQGSDLGMAEEAERPGEK\AGQHSPLRE
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG
VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMWVARQNN 
DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5987 


1805 


484 


DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP 
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI
LISWVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPLRE
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG
VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMNVARQNN
DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5988 


1292 


410 


FKKYFLSFLGLLESSHSRDRIHNLVLMFIiLATHNLVWWFTCRFQ
RLDCIYLNAGIMPNPQLNIKAIiLFGLFS\AEGLLTQGDKITADG
LQEVFETDVFGHFILIRELEPl^CHSDNPSQLIWTSSRNARKSN
FSLEDFQHSKGKEPYSSSKYATDLLSVALNRNFNQQGLYSNVAC
PGTALTNLTYGILPPFIWTLLMPAILLLRFFANAFTLTPYNGTE
ALVWLFHQKPBSLNPLIKYLSATTGFGRNYIMTQKMDLDBDTAE
KFYQKLLELEKHIRVTIQjKTDNOARLSGSCL


5989 


194 


2610 


AMDFPQHSQHVLEQLNQQRQLGIaLCDCTFWDGVHFKAHKAVLA 
ACSEYFKMliFVDQKDWHLDISNAAGLGQVLEFMYTAKLSI*SPE
NVDDVL\AVATFIiQMQDIITACHALKSLAEPATSPGGNAEAIiAT
EGGEKRAKEEKVATSTLSRLEQAGRSTPIGPSRDLKEERGGOAQ 
SAASGAEQTEKADAPREPPPVELKPDPTSGMAAARAEAALSESS 
EQEMEVEPARKGEEEOKEQEEQEEEGAGPAEVKEEGSQIiEWGEA 
PEENENEESAGTDSGQELGSEARGLRSGTYGDRTESKAYGSVIH 
KCEDCGKEFTHTGNFKRHIRIHTGEKPFSCRECSKAFSDPAACK 
AHEKTESPLKPYGCEECGKSYRLISLLNLRKKRHSGEARYRCED 
CGKLFTTSGNLKRHQLVHSGEKPYQCDYCGRSFSDPTSKMRHLE
TKDTDKEHKCPHCDKKFNQVGNLKAHLKIHIADGPLKCRECGKQ 
FTTSGNLKRHLRIHSGEKPYVCIHCQRQFADPGALQRHVRIHTG 
EKPCQCVMCGKAFTQASSLIAHVRQHTGEKPYVCERCGKRFVQS 
SQLANHIRHHDNIRPHKCSVCSKAFVNVGDLSKHIIIHTGEKPY
LCDKOGRGFNRVDNLRSIIVKTVHQGKAGIKIL^PEEGSEVSWT 
VDDMVTIATEAIjAATAVTQLTWPVGAAVTADETEVLKAEISKA 
VKQVQEEDPNTHILYACDSCGDKFLDANSLAQHVRIHTAQALVM
FQTDADFYQQYGPGGTWPAGQVLQAGELVFRPRDGAEGQPALAE 
TSPTAPECPPPAE 


5990 


2 


4700 


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGEQVLLHEEAGD
SGFVSLSRLGPSIiRDKDLEMEELMLQDETLLGTMQSYMDASIilS
LIEDFGSLGEVEMSIjPDPSWDFSPPSFLETSSPKIjPSWRPPRSR
PRWGQSPPPQQRSDGEEEEEVASFSGQILAGELDNCVSSIPDFP
MHLACPEEEDKATAAEMAVPAAGDESISSLSELVRAMHPYCLPN
LTHLASLEDELQEQPDDLTLPEGCWLEIVGQAATAGDDLEIPV
VVRQVSPGPRPVLIJDDSLETSSALQLLMPTITESETEAAVPKVTL
CSEKEGLSLNSEEKLDSACLLKPREWEPWPKEPQNPPANAAP
GSQRARKGRKKKSKBQPAACVEGYARRIjRSSSRGQSTVGTEVTS
QVDNLQKQPQEELQKESGPtiQGKGKPRAWARAWAAALENSSPKN
LERSAGQSSPAKEGPLDLYPKLADTIQTNPIPTHLSLVDSAQAS
PMPVDSVEADPTAVGPVLAGPVPVDPGLVDIiASTSSELVEPLPA 
EPVLIWPVLADSAAVDPAVVPISDNLPPVDAVPSGPAPVDLALV 
DPVPNDLTPVDPVLVKSRPTDPRRGAVSSALGGSAPQLLVESES
LDPPKTIIPEVKEWDSLKIESGTSATTHEARPRPLSLSEYRKR 
RQQRQAETEKRS PQPPl'GKWPSLPETPTGLADI PCLVI PPAPAK 
KTALQRSPETPLEICLVPVGPSPASPSPEPPVSKPVASSPTEQV 
PSQEMPLLARPSPPVQSVSPAVPTPPSMSAAiPFPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCyPHVSPSGYP 
CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 
GPI*GWGPGPQHAPFWSTVPPFFLPPASIGRAVPQPKMESRGTPA 
GPPENVLPLSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 
KKVSALVQSPQMKALACVSAEGVTVEEPASERLKPETQETRPRE 
KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSE 
DWQAFISEIGIEASDLSSLLEQFEKSEAKKECPPPAPADSLAV 
GNSGGVDIPQEKRPLDRLQAPELANVAGLTPPATPPHQLWKPLA
AVSLLAKAKSPKSTAQEGTLKPEGVT3AKIIPAAVRLQEGVHGPS
RVHVG3GDHDYC\VRSRTPPKK\MPAI»LIPEVGSRWNVKRHQDI 
TIKPVLSLGPAAPPPPCIAASREPLDHRTSSEQADPSAPCLAPS
SLLSPEASPCRNDMNTRTPPEPSAKQRSMRCYRKACRSASPSSQ
GWQGRl-iGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 
HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PSPRRRSDRRRRYSSYRSHDHYQRQRVrjQKERAIBERRWFIGK 
I PGRMTRS ELKOR FSVFGET FFf*T T hpp VOfSTOOrv/5 vi w i? 

AFAAIESGHKIiRQADEQPFDIiCFGGRRQFCKRSYSDU3SNREDF 
DPAPVKSKFDSLDFDTLLKQAQKNLRR


5991 


334 


1379 


RLSSHFSQCSPSIYCNTKFDKQGNVTSFERKKTELYQELGIjQAR 
DLRFQHVMSITVRNNRIIMRMEYLKAVITPECIiLILDYRNliNLK
QWLFR3LPSOLSGEGOLVTYPLPFEFRATEALLClYWTNTT,nf3Tfr. 
SILQPLILETLDALGDPKHSSVDRSKL*HILLQNGKSLSELETDI 
KIFKESILEILDEEELLEBLCVSKWSDPQVFEKSSAGIDHAEEM
ELLLENYYRLADDLSNAARELRVLIDDSQSIIFINLDSHRNVMM
RLNLQLTMGTFSLSLFGLMGVAFGMNXjESSLEEDHRIFWLITGI
MFMGSGLIWRRLLSFLGR/LARSSIASYGMKDMVHGGIVEGL


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKLRRVL 
SGQDDEEQGLTAQDSQINL/SEVLDASSLSFNTPJLKWFAICFVC 
GVFFSILGTGLLWLPGGIKLFAVFYTLGNLAALASTCFLMGPVK
QLKKMFEATRLLATIVMLLCFIFTLCAALWWHKKGLAVLFCILQ 
FLSMTWYSLSYIPYARDAVIKCCSSLLS


5993 


1650 


594 


AEGLGSWAVWAGLGWAGRHMEAGGATGALGVGCKLPSAFCFPGS 
SVAMDMFQKVEKIGEGTYGWYKAKNRETGQLVAIiKKIRLDLEM
EGVPSTAIREISLLKELKHPNIVRLLDWHNERKLYIjVFEFLSQ
DLKKYMDSTPGSELPLHLIKSYLFQLLQGVSFCHSHRVIHRDLK
PQNLLINELGAIKLADFGLARAFGVPIiRTYTHEWTIjWYRAPEI
LLATRFYTTAVDIWSIGCIFAEMVTRKALFPGDS\EIDQ\LFRI
FRMLGTPSEDTWPGVTQLPDYKGSFPKWTRKGLEEIVPNLEPEG
RDLLMQLLQYDPSQRITAKTALAHPYFSSPEPSPAARQYVLQRF
RH 


5994


394 


1334 


AGEVQLHVWIRGMRIQPQ/KAAAIIDLDPDFEPQSRPRSCTWPL
PRPEIANQPSKPPEVEPDLGEKVHTEGRSEPILLPSRLPEPAGG
PQPGILGAVTGPRKGGSRRNAWGNQSYAELISQAIESAPEKRLT 
LAQIYEWMVRTVPYFKDKGDSNSSAGWKNSIRHNLSLHSKFIKV 
HNEATGXSSWWMLNPEGGKSGKAPRRRAASMDSSSKIiLRGRSKA
PKKKPSGLPAPPEGATPTSPVGHFAKWSGSPCSRNREEADMWTT 
FRPRSSSNASSVSTRLSPIiRPESEVIiAEEIPASVSSYAGGVPPT
LNEGLELLDGLNLTSSHSLLSRSGLSGFSLQHPGVTGPIiHTYSS
SLFSPAEGPLSAGEGCFSSSQALEALLTSDTPPPPADVLMTQVD 
Pir.SQAPTLLLLGGLPSSSKLATGVGLCPKPLEAPGPSSLVPTL 
SMIAPPPVMASAPIPKALGTPVLTPPTEAASQDRMPQDLDLDMY
MENLECDMDNIISDLMDEGEGIiDFNFEPDP


5995 


2 


2437 


RPPGPGPASGAWLCTRARGSAAFVPPIiPRPPSRGARRRRRLPGR
GVAALRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA
AEMMEELHSL\DP\RRQELLEARF\TGLGVSKGPLNSESSNQSL
CSVGSLSDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 